
Nithya Ruff on Open Source Contributions Beyond Code

Sometimes when we think about open source, we focus on the code and forget that there are other equally important ways to contribute. Nithya Ruff, Senior Director, Open Source Practice at Comcast, knows that contributions can come in many forms. “Contribution can come in the form of code or in the form of financial support for projects. It also comes in the form of evangelizing open source; it comes in the form of sharing good practices with others,” she said.

Comcast, however, does contribute code. When I sat down with Ruff at Open Source Summit to learn more, she made it clear that Comcast isn’t just a consumer; it contributes a great deal to open source. “One way we contribute is that when we consume a project and a fix or enhancement is needed, we fix it and contribute back.” The company has made roughly 150 such contributions this year alone.

Comcast also releases its own software as open source. “We have created things internally to solve our own problems, but we realized they could solve someone else’s problem, too. So, we released such internal projects as open source,” said Ruff.

Watch the video interview at The Linux Foundation


How to Deploy Hyperledger Fabric on Kubernetes – Part II

We recently hosted a webinar about deploying Hyperledger Fabric on Kubernetes. It was taught by Alejandro (Sasha) Vicente Grabovetsky and Nicola Paoli from AID:Tech.

The webinar contained detailed, step-by-step instructions showing exactly how to deploy Hyperledger Fabric on Kubernetes. For those who prefer reading to watching, we have prepared a condensed transcript with screenshots that will take you through the process, adapted to recent updates in the Helm charts for the Orderers and Peers.

Are you ready? Let’s dive in!

What we will build

  • Fabric CA

First, we will deploy a Fabric Certificate Authority (CA) serviced by a PostgreSQL database for managing identities.

  • Fabric Orderer

Then, we will deploy an ordering service of several Fabric ordering nodes communicating and establishing consensus over an Apache Kafka cluster. The Fabric Ordering service provides consensus for development (solo) and production (Kafka) networks.

  • Fabric Peer

Finally, we will deploy several Peers and connect them with a channel. We will bind them to a CouchDB database.
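The sequence above can be sketched with Helm. This is a configuration sketch only, assuming a running Kubernetes cluster with Helm v2 initialized and the Hyperledger charts from the Helm stable/incubator repositories of that era (hlf-ca, hlf-ord, hlf-couchdb, hlf-peer); the release names are illustrative, and real deployments need the values files covered in the webinar:

```shell
# Sketch only: assumes a live Kubernetes cluster, Helm v2, and the
# stable/incubator chart repos. Release names are examples, not defaults.
helm install stable/hlf-ca --name ca            # Fabric CA for identities
helm install incubator/kafka --name kafka       # Kafka cluster for consensus
helm install stable/hlf-ord --name ord0         # a Fabric Orderer node
helm install stable/hlf-couchdb --name cdb0     # CouchDB state database
helm install stable/hlf-peer --name peer0       # a Fabric Peer bound to CouchDB
```

Channel creation and the PostgreSQL backing store for the CA are omitted here; the webinar walks through both.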

Read more at Hyperledger


EBBR Aims to Standardize Embedded Boot Process

Arm’s open source EBBR (Embedded Base Boot Requirements) specification is heading for its v1.0 release in December. Within a year or two, the loosely defined EBBR standard should make it easier for Linux distros to support standardized bootup on major embedded hardware platforms.

At the recent Embedded Linux Conference Europe, Grant Likely, a Linux kernel engineer who works at Arm, explained the basics of EBBR and why we need it. Previous ELC talks by Likely include a primer on hardware hacking basics for software developers.

EBBR is not a new technology, but rather a requirements document based on existing standards that defines firmware behavior for embedded systems. Its goal is to ensure interoperability between embedded hardware, firmware projects, and the OS.

The spec is based largely on the desktop-oriented Unified Extensible Firmware Interface (UEFI) spec and incorporates work that was already underway in the U-Boot community. EBBR is designed initially for Arm Linux boards loaded with the industry standard U-Boot or UEFI’s TianoCore bootloaders. EBBR also draws upon Arm’s Linaro-hosted Trusted Firmware-A project, which supplies a preferred reference implementation of Arm specifications for easier porting of firmware to modern hardware.

EBBR is currently “working fine” on the Raspberry Pi and has been successfully tested on several Linux hacker boards, including the Ultra96 and MacchiatoBIN, said Likely. Despite the Arm Linux focus, EBBR is already in the process of supporting multiple hardware and OS architectures. The FreeBSD project has joined the effort, and the RISC-V project has shown interest. Additional bootloaders will also be supported.

Why EBBR now?

The UEFI standard emerged when desktop software vendors struggled to support a growing number of PC platforms. Yet, it never translated well to embedded, which lacks the uniformity of the PC platform, as well as the “economic incentives toward standardization that work on the PC level,” said Likely. Unlike the desktop world, a single embedded firm typically develops a custom software stack to run on a custom-built device “all bound together.”

In recent years, however, several trends have pushed the industry toward a greater interest in embedded standards like EBBR. “We have a lot of SBCs now, and embedded software is getting more complicated,” said Likely. “Fifteen years ago, we could probably get by with the Linux kernel, BusyBox, and a bit of custom code on top. But with IoT, we’re expected to have things like network stacks, secure updates, and revision control. You might have multiple hardware platforms to support.”

As a result, there’s a growing interest in using pre-canned distros to offload the growing burden of software maintenance. “There’s starting to be an economic incentive to boot an embedded system on all the major distros and all the major SBCs,” said Likely. “We’ve been duplicating a lot of the same boot setup work.”

The problem is that bootloaders like U-Boot behave differently on different hardware, “with slightly different boot scripts and Device Tree setup on each board,” he continued. “It’s impossible for Linux distros to support more than a couple of boards. The growing number of board-specific images also poses problems. Not only is it a problem of the boot flow — getting the firmware on the system into the OS — but of how to tell the OS what’s on the board.”

As the maintainer of the Linux Device Tree, Likely is an expert on the latter issue. “Device Tree gives us the separation of the board description and the OS, but the next step is to make that part of the platform,” he explained. “So far we haven’t had a standard pre-boot environment in embedded. But if you had a standard boot flow and the machine could describe itself to the OS, the distros would be able to support it. They wouldn’t need to create a custom kernel, and they could do kernel and security updates in a standard way.”

Some embedded developers have expressed skepticism about EBBR’s dependence on the “bloated” UEFI code, with fears that it will slow down boot time. Yet, Likely claimed that the EBBR implementation adds “insignificant” overhead.

EBBR also differs from UEFI in that it’s written with consideration of embedded constraints and real-world practices. “Standard UEFI says if your storage is on eMMC, then firmware can’t be loaded there, but EBBR is flexible enough to account for doing both on the same media,” said Likely. “We support limited hardware access at runtime and limited variable access to deal with things like the mechanics of device partitioning.” In addition, the spec is flexible enough that “you can still do your custom thing within the standard’s framework without breaking the flow.”

EBBR v1.0 and beyond

When Arm first released its universal boot proposal, “nobody was interested,” said Likely. The company returned with a second EBBR proposal that was launched as an open source project with a CC-BY-SA license and a GitHub page. Major Linux distro projects started taking interest.

Arm is counting on the distro projects to pressure semiconductor and board-makers to get on board. Already, several chipmakers including ST, TI, and Xilinx have shown interest. “Any board that is supported in U-Boot mainline should work,” said Likely. “Distros will start insisting on it, and it will probably be a requirement in a year or two.”

The upcoming v1.0 release will be available in server images for Fedora, SUSE, Debian, and FreeBSD that will boot unmodified on mainline U-Boot. The spec initially runs on 32- and 64-bit Arm devices and supports both ACPI power management and Linux Device Tree. It requires Arm’s PSCI (Power State Coordination Interface) technology on 64-bit Arm devices. The v1.0 spec provides storage guidance and runtime services guidance, and it may include a QEMU model.

Future releases will look at secure boot, capsule updates, more embedded use cases, better UEFI compliance, and improved non-Linux representation, said Likely. Other goals include security features and a standard testing platform.

“These are all solvable problems,” said Likely. “There are no technical barriers to boot standardization.”

You can watch the complete presentation here:

[youtube https://www.youtube.com/watch?v=Zz5wGjY9VpU]


Machine Learning, Biased Models, and Finding the Truth

Machine learning and statistics are playing a pivotal role in finding the truth in human rights cases around the world – and serving as a voice for victims, Patrick Ball, director of Research for the Human Rights Data Analysis Group, told the audience at Open Source Summit Europe.

Ball began his keynote, “Digital Echoes: Understanding Mass Violence with Data and Statistics,” with background on his career, which started in 1991 in El Salvador, building databases. While working with truth commissions from El Salvador to South Africa to East Timor, with international criminal tribunals as well as local groups searching for lost family members, he said, “one of the things that we work with every single time is trying to figure out what the truth means.”

In the course of the work, “we’re always facing people who apologize for mass violence. They tell us grotesque lies that they use to attempt to excuse this violence. They deny that it happened. They blame the victims. This is common, of course, in our world today.”

Human rights campaigns “speak with the moral voice of the victims,’’ he said. Therefore, it is critical that statistics, including machine learning, are accurate, Ball said.

He gave three examples of when statistics and machine learning proved to be useful, and where they failed.

Learn more and watch the complete presentation at The Linux Foundation


cregit: Token-Level Blame Information for the Linux Kernel

Who wrote this code? Why? What changes led to this function’s current implementation?

These are typical questions that developers (and sometimes lawyers) ask during their work. Most software development projects use version control software (such as Git or Subversion) to track changes and use the “blame” feature of these systems to answer these questions.

Unfortunately, version control systems are only capable of tracking full lines of code. Imagine the following scenario: A simple file is created by Developer A; later, it is changed by Developer B, and finally, by Developer C. The following figure depicts the contents of the file after each modification. The source code has been colored according to the developer who introduced it (blue for Developer A, green for Developer B, and red for Developer C; note that Developer B only changed whitespace, including merging some lines).

Blame tracks lines not tokens

If we were to use git to track these changes and run git-blame with default parameters, its output would show that Developers B and C are mostly responsible for the contents of the file. However, if we were to instruct blame to ignore changes to whitespace, the results would be:

In general, one would expect to always ask blame to ignore whitespace. Unfortunately, this is not always possible (for example, GitHub’s “blame” view is computed without ignoring whitespace).
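The whitespace effect is easy to reproduce in a scratch repository. A minimal sketch, assuming git is installed; the file contents and author names are illustrative:

```shell
# Build a two-commit repo where the second commit changes only whitespace,
# then compare default blame with whitespace-ignoring blame.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q .
printf 'int x = 0;\n' > f.c
git add f.c
git -c user.name=A -c user.email=a@example.com commit -qm 'initial version'
printf 'int  x = 0;\n' > f.c   # whitespace-only change
git -c user.name=B -c user.email=b@example.com commit -qam 'reformat'
git blame --line-porcelain f.c | grep '^author '     # attributes the line to B
git blame -w --line-porcelain f.c | grep '^author '  # -w ignores whitespace: A
```

With default options the whitespace-only commit steals the blame; with -w the line is correctly attributed to the original author.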

Note that, even if we run blame with the ignore-whitespace option, the “blame” is still incorrect. First, merged or split lines are not handled properly by blame (the ignore-whitespace option does not account for them). Second, lines that were mostly authored by Developer A are now assigned to Developer C because she was the last one to modify them.

If we consider the token as the indivisible unit of source code (i.e., a token cannot be modified, it can only be removed or inserted), then what we really want is to know who is responsible for introducing each token to the source code base. A blame-per-token for the file in our example would look like the figure below. Note how it correctly shows that the only changes made by C to the source code were the replacement of int with long in three places, and that B made no changes to the code:
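The idea can be approximated with stock git: if each token is stored on its own line (essentially how cregit’s output repository tracks source), ordinary line-level blame becomes token-level blame. A minimal sketch, assuming git is installed; the tokens and author names mirror the example above:

```shell
# Store "int x = 0 ;" one token per line, then let Developer C replace
# int with long; line-level blame on the token file is token-level blame.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q .
printf 'int\nx\n=\n0\n;\n' > f.tok
git add f.tok
git -c user.name=A -c user.email=a@example.com commit -qm 'v1'
printf 'long\nx\n=\n0\n;\n' > f.tok
git -c user.name=C -c user.email=c@example.com commit -qam 'v2'
git blame --line-porcelain f.tok | sed -n 's/^author //p'
# prints C for the first token (long) and A for the remaining four
```

Only the replaced token is attributed to C; everything else stays with its original author, exactly as the per-token figure shows.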

cregit: improving blame of source code

We created cregit to do exactly this. The goal of cregit is to provide token-level blame for a software system whose history has been recorded using git. The details of cregit’s implementation can be found in this Working Paper (currently under review).

We have empirically evaluated cregit on several mature software systems. In our experiments, we found that blame-per-line tends to be accurate between 70% and 80% of the time. Accuracy depends heavily on how much the code has been modified: the more that existing code is changed, the less likely blame-per-line is to be accurate. cregit, on the other hand, is able to increase this accuracy to 95% (please see the paper mentioned above for details).

For the last two years, we have been running cregit on the source code of the Linux kernel. The results can be found at: https://cregit.linuxsources.org/code/4.19/.

Blame-per-line is easy to implement: just put the blame information alongside each line. Blame-per-token is significantly more complex, as the tokens of a single line might have different authors and/or commits responsible for them. Hence, we are currently rolling out an improved view of blame-per-token for release 4.19 of the Linux kernel (older releases use an older view, to which most of the information here does not apply).

cregit views: inspecting who changed what/when

Below is an example of the blame-per-token views of Linux 4.19, specifically for the file audit.c.

The top part gives us an overview of who the authors of the file are. The first 50 authors are individually colored. The source code is colored according to the person who last added the token. The right-hand side of the view shows an overview of the “ownership” of the source code.

While hovering over the source code, you will see a box displaying information about how that token got into the source code: the commit id, its author, and its commit timestamp and summary. If you click on the token, this information is enhanced with a link to the email thread that corresponds to the code review of the commit that inserted that token, as shown below:

The views are highly interactive. For example, one can select to highlight a commit (top middle combo box). In this case, all the code is grayed out, except for the tokens that were added by that commit, as shown below.

You can also click on an author’s name, and only that author’s code will be highlighted. For example, in the image below I have highlighted Eric Paris’s contributions.

cregit is also capable of highlighting the age of the code. The sliding bar at the top right allows you to narrow the period of interest. Below, I have selected to show changes during the last two years (note that the file was last modified on July 17, 2018).

It is also possible to focus on a specific function, which can be selected with the Functions combo box at the top of the source code. In the example below I have selected the function audit_set_failure. The rest of the code has been hidden.

These features can be easily combined. For example, you can select code of a given age by a specific author, and narrow it to a given function!

cregit views: improving the linkage of email code reviews

We are going to keep expanding the information shown in the commit panel. Currently, in addition to the metadata of the commit that is responsible for the token, it provides hyperlinks to the commit patch, and to any email discussions we have been able to find regarding this commit. We are working to match more and more commits.

cregit: where to get it

cregit is open source and is accessible from https://github.com/cregit/cregit. It is capable of processing C, C++, Java, and Go. We can probably add support for Perl and Python fairly easily. All we need to support a new language is a tokenizer.

cregit’s input is a git repository, and its output is another git repository that tracks the source code by token (see the paper for details). From this repository we construct the blame views shown above. If you are interested in having your repository processed with cregit, email me.

Finally, I would like to acknowledge several people for their contributions:

  • Bram Adams. Bram and I are the creators of cregit.
  • Jason Lim. As part of his coursework at UVic he implemented the new cregit views, which have greatly improved their usefulness.
  • Alex Courouble. As part of his master’s at Polytechnique Montréal, he implemented the matching algorithms of commits to email discussions, based on earlier work by Yujuan Jiang during her PhD.
  • Kate Stewart. She has been instrumental in gathering user requirements and evaluating cregit and its views.
  • Isabella Ferreira. She is picking up where Alex left off and continues to improve the matching of commits to emails.

This article was written by Daniel German (dmg@turingmachine.org) and originally appeared on GitHub.


Get Cyber Monday Savings on Linux Foundation Training and Certification

It’s time for our biggest sale of the year. The Linux Foundation’s annual Cyber Monday event means you can get trained and get certified at a huge discount.

And, you’ll get a free T-shirt with every purchase!

Through the limited-time Cyber Monday training sale, we’re offering prep course and certification exam bundles for just $179.

This offer includes the prep course and exam for the following certification options:

  • Linux Foundation Certified Engineer (LFCE) This option is designed for the Linux engineer looking to demonstrate a more advanced level of Linux administration and engineering skill.

  • Cloud Foundry Certified Developer (CFCD) This program will verify your expertise in using the Cloud Foundry platform and building cloud-native applications.

  • Certified Kubernetes Administrator (CKA) This program assures you have the skills and knowledge to perform the responsibilities of a Kubernetes administrator.

  • Certified Kubernetes Application Developer (CKAD) This option certifies that you can design, build, configure, and expose cloud native applications for Kubernetes.

  • Certified OpenStack Administrator (COA) This program provides essential OpenStack and cloud infrastructure skills.

Sign up now to take advantage of this special training offer from The Linux Foundation.


Why Should You Use Microservices and Containers?

What to expect when you’re working with microservices and containers.

First of all, what are microservices? Microservices is an architectural style that splits your application into multiple services, each of which performs a fine-grained function that is part of your application as a whole. Each of your microservices will have a different logical function for your application. Microservices is a more modern approach to an application’s architecture compared to a monolithic architecture, where all of your application’s components and functions live in a single instance. You can see a comparison of monolithic and microservices architectures in the diagram below.

Monoliths vs microservices

Read more at IBM Developer



Three SSH GUI Tools for Linux

At some point in your career as a Linux administrator, you’re going to use Secure Shell (SSH) to remote into a Linux server or desktop. Chances are, you already have. In some instances, you’ll be SSH’ing into multiple Linux servers at once. In fact, Secure Shell might well be one of the most-used tools in your Linux toolbox. Because of this, you’ll want to make the experience as efficient as possible. For many admins, nothing is as efficient as the command line. However, there are users out there who do prefer a GUI tool, especially when working from a desktop machine to remote into and work on a server.

If you happen to prefer a good GUI tool, you’ll be happy to know there are a couple of outstanding graphical tools for SSH on Linux. Couple that with a unique terminal window that allows you to remote into multiple machines from the same window, and you have everything you need to work efficiently. Let’s take a look at these three tools and find out if one (or more) of them is a good fit for your needs.

I’ll be demonstrating these tools on Elementary OS, but they are all available for most major distributions.

PuTTY

Anyone who’s been around long enough knows about PuTTY. In fact, PuTTY is the de facto standard tool for connecting, via SSH, to Linux servers from the Windows environment. But PuTTY isn’t just for Windows; it can also be installed on Linux from within the standard repositories. PuTTY’s feature list includes:

  • Saved sessions

  • Connection via IP address or hostname

  • Alternative SSH port definition

  • Connection type definition

  • Logging

  • Options for keyboard, bell, appearance, connection, and more

  • Local and remote tunnel configuration

  • Proxy support

  • X11 tunneling support

The PuTTY GUI is mostly a way to save SSH sessions, so it’s easier to manage all of those various Linux servers and desktops you need to constantly remote into and out of. Once you’ve connected from PuTTY to the Linux server, you will have a terminal window in which to work. At this point, you may be asking yourself, why not just work from the terminal window? For some, the convenience of saved sessions does make PuTTY worth using.
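For comparison, the command line offers a similar saved-session convenience through OpenSSH’s client configuration file. A minimal sketch; the alias, address, port, and user below are placeholders for your own server:

```shell
# Append a "saved session" to the OpenSSH client config; afterward,
# "ssh webserver" behaves like "ssh -p 2222 admin@192.168.1.50".
mkdir -p ~/.ssh && chmod 700 ~/.ssh
cat >> ~/.ssh/config <<'EOF'
Host webserver
    HostName 192.168.1.50
    Port 2222
    User admin
EOF
```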

Installing PuTTY on Linux is simple. For example, on a Debian-based distribution, you could issue the command:

sudo apt-get install -y putty

Once installed, you can either run the PuTTY GUI from your desktop menu or issue the command putty. In the PuTTY Configuration window (Figure 1), type the hostname or IP address in the Host Name (or IP address) field, configure the port (if not the default 22), select SSH as the connection type, and click Open.

Once the connection is made, you’ll then be prompted for the user credentials on the remote server (Figure 2).

To save a session (so you don’t have to always type the remote server information), fill out the IP address (or hostname), configure the port and connection type, and then (before you click Open) type a name for the connection in the top text area of the Saved Sessions section, and click Save. This saves the configuration for the session. To connect to a saved session, select it from the Saved Sessions list, click Load, and then click Open. You should then be prompted for the credentials on the remote server.

EasySSH

Although EasySSH doesn’t offer the amount of configuration options found in PuTTY, it’s (as the name implies) incredibly easy to use. One of the best features of EasySSH is that it offers a tabbed interface, so you can have multiple SSH connections open and quickly switch between them. Other EasySSH features include:

Installing EasySSH on a Linux desktop is simple, as the app can be installed via Flatpak (which does mean you must have Flatpak installed on your system). Once Flatpak is installed, add EasySSH with the commands:

sudo flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
sudo flatpak install flathub com.github.muriloventuroso.easyssh

Run EasySSH with the command:

flatpak run com.github.muriloventuroso.easyssh

The EasySSH app will open, where you can click the + button in the upper left corner. In the resulting window (Figure 3), configure your SSH connection as required.

Once you’ve added the connection, it will appear in the left navigation of the main window (Figure 4).

To connect to a remote server in EasySSH, select it from the left navigation and then click the Connect button (Figure 5).

The one caveat with EasySSH is that you must save the username and password in the connection configuration (otherwise the connection will fail). This means anyone with access to the desktop running EasySSH can remote into your servers without knowing the passwords. Because of this, you must always remember to lock your desktop screen any time you are away (and make sure to use a strong password). The last thing you want is to have a server vulnerable to unwanted logins.

Terminator

Terminator is not actually an SSH GUI. Instead, Terminator functions as a single window that allows you to run multiple terminals (and even groups of terminals) at once. Effectively, you can open Terminator, split the window vertically and horizontally (until you have all the terminals you want), and then connect to all of your remote Linux servers by way of the standard SSH command (Figure 6).

To install Terminator, issue a command like:

sudo apt-get install -y terminator

Once installed, open the tool either from your desktop menu or with the command terminator. With the window open, you can right-click inside Terminator and select either Split Horizontally or Split Vertically. Continue splitting the terminal until you have exactly the number of terminals you need, and then start remoting into those servers.

The caveat to using Terminator is that it is not a standard SSH GUI tool, in that it won’t save your sessions or give you quick access to those servers. In other words, you will always have to manually log into your remote Linux servers. However, being able to see your remote Secure Shell sessions side by side does make administering multiple remote machines quite a bit easier.

Few (But Worthwhile) Options

There aren’t a lot of SSH GUI tools available for Linux. Why? Because most administrators prefer to simply open a terminal window and use the standard command-line tools to remotely access their servers. However, if you have a need for a GUI tool, you have two solid options and one terminal that makes logging into multiple machines slightly easier. Although there are only a few options for those looking for an SSH GUI tool, those that are available are certainly worth your time. Give one of these a try and see for yourself.


A Closer Look at Voice-Assisted Speakers

U.S. consumers are expected to drop a bundle this Black Friday on smart speakers and home hubs. A Nov. 15 Canalys report estimates that shipments of voice-assisted speakers grew 137 percent in Q3 2018 year-to-year and are on the way to 75 million-unit sales in 2018. At the recent Embedded Linux Conference and Open IoT Summit in Edinburgh, embedded Linux developer and Raspberry Pi HAT creator Leon Anavi of the Konsulko Group reported on the latest smart speaker trends.

As Anavi noted in his “Comparison of Voice Assistant SDKs for Embedded Linux Devices” talk, conversing with computers became a staple of science fiction over half a century ago. Voice technology is interesting “because it combines AI, big data, IoT, and application development,” said Anavi.

In Q3 2017, Amazon and Google owned the industry with 74.7 percent and 24.6 percent, respectively, said Canalys. A year later, the percentages were down to 31.9 and 29.8. China-based Alibaba and Xiaomi almost equally split another 21.8 percent share, followed by 17.4 percent for “others,” which mostly use Amazon Alexa and, increasingly, Google Assistant.

Despite the success of the mostly Linux-driven smart speaker market, Linux application developers have not jumped into voice app development in the numbers one might expect. In part, this is due to reservations about Google and Amazon privacy safeguards, as well as the proprietary nature of the hardware and cloud software.

“Privacy is a concern with smart speakers,” said Anavi. “You can’t fully trust a corporation if the product is not open source.”

Anavi summarized the Google and Amazon SDKs but spent more time on the fully open source Mycroft Mark. Although Anavi clearly prefers Mycroft, he encouraged developers to investigate all the platforms. “There is a huge demand in the market for these devices and a lot of opportunity for IoT integration, from writing new skills to integrating voice assistants in consumer electronics devices,” said Anavi.

Alexa/Echo

Amazon’s Alexa debuted in the Echo smart speaker four years ago. Amazon has since expanded to the Echo branded Dot, Spot, Tap, and Plus speakers, as well as the Echo Show and new Echo Show 2 display hubs.

The market leading Echo devices run on Amazon’s Linux- and Android-based Fire OS. The original Echo and Dot ran on the Cortex-A8-based TI DM3725 SoC while more recent devices have moved to an Armv8 MediaTek MT8163V SoC with 256MB RAM and 4GB flash.

Thanks to Amazon’s wise decision to release an Apache 2.0 licensed Alexa Voice Services (AVS) SDK, Alexa also runs on most third-party hubs. The SDK includes an Alexa Skills Kit for creating custom Skills. The cloud platform required to make Alexa devices work is not open source, however, and commercial vendors must sign an agreement and undergo a certification process.

Alexa runs on a variety of hardware including the Raspberry Pi, as well as smart devices ranging from the Ecobee4 Smart Thermostat to the LG Hub Robot. Microsoft recently began selling Echo devices, and earlier this year partnered with Amazon to integrate Alexa with its own Cortana voice agent in devices. This week, Microsoft announced that users can voice-activate Skype calls via Alexa on Echo devices.

Google Assistant/Home

The Google Assistant voice agent debuted on the Google Home smart speaker in 2016. It has since expanded to the Echo Dot-like Home Mini, which like the Home runs on a 1.2GHz dual-core Cortex-A7 Marvell Armada 1500 Mini Plus with 512MB RAM and 4GB flash. This year’s Home Max offered improved speakers and advanced to a 1.5GHz, quad-core Cortex-A53 processor. More recently, Google launched the touchscreen enabled Google Home Hub.

The Google Home devices run on a version of the Linux-based Google Cast OS. As with Alexa, the Python-based Google Assistant SDK lets you add the voice agent to third-party devices. However, it’s still in preview stage and lacks an open source license. Developers can create applications with Google Actions.

Last year, Google launched a version of its Google Assistant SDK for the Raspberry Pi 3 and began selling an AIY Voice Kit that runs on the Pi. There’s also a kit that runs on the Orange Pi, said Anavi.

This year, Google has aggressively courted hardware partners to produce home hub devices that combine Assistant with Google’s proprietary Android Things. The devices run on a variety of Arm-based SoCs led by the Qualcomm SD212 Home Hub Platform.

The SDK expansion has resulted in a variety of third-party devices running Assistant, including the Lenovo Smart Display and the just released LG XBOOM AI ThinQ WK9 touchscreen hubs. Sales of Google Home devices outpaced Echo earlier this year, although Amazon regained the lead in Q3, says Canalys.

Like Alexa, but unlike Mycroft, Google Assistant offers multilingual support. The latest version supports follow-up questions without having to repeat the activation word, and there’s a voice match feature that can recognize up to six users. A new Google Duplex feature accomplishes real-world tasks through natural phone conversations.

Mycroft/Mark

Anavi’s favorite smart speaker is the Linux-driven, open source (Apache 2.0 and CERN) Mycroft. The Raspberry Pi-based Mycroft Mark 1 speaker was certified by the Open Source Hardware Association (OSHWA).

The Mycroft Mark II launched on Kickstarter in January and has received $450,000 in funding. This Xilinx Zynq UltraScale+ MPSoC driven home hub integrates Aaware’s far-field Sound Capture technology. A Nov. 15 update post revealed that the Mark II will miss its December ship date.

Kansas City-based Mycroft has raised $2.5 million from institutional investors and is now seeking funding on StartEngine. Mycroft sees itself as a software company and is encouraging other companies to build the Mycroft Core platform and Mycroft AI voice agent into products. The company offers an enterprise server license to corporate customers for $1,500 a month, and there’s a free, Raspbian based Picroft application for the Raspberry Pi. A Picroft hardware kit is under consideration.

Mycroft promises that user data will never be saved without an opt-in (opted-in data is used to improve machine learning algorithms) and will never be used for marketing purposes. Like Alexa and Assistant, however, Mycroft is not available offline without a cloud service; an offline mode would better ensure privacy. Anavi says the company is working on an offline option.

The Mycroft AI agent is enabled via a Python based Mycroft Pulse SDK, and a Mycroft Skills Manager is available for Skills development. Like Alexa and Assistant, Mycroft supports custom wake words. The new version uses its homegrown Precise wake-word listener technology in place of the earlier PocketSphinx. There’s also an optional device and account management stack called Mycroft Home.

For text-to-speech (TTS), Mycroft defaults to the open source Mimic, which is co-developed with VocaliD. It also supports eSpeak, MaryTTS, Google TTS, and FATTS.

Mycroft lacks its own speech-to-text (STT) engine, which Anavi calls “the biggest challenge for an open source voice assistant.” Instead, it defaults to Google STT and supports IBM Watson STT and wit.ai.
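Switching STT backends in Mycroft is a configuration change rather than a code change. The snippet below is a rough sketch of the relevant stanza in the user-level `mycroft.conf` JSON file; the exact key names and accepted module values are assumptions based on mycroft-core’s configuration layout and may differ between versions.

```json
{
  "stt": {
    "module": "google"
  }
}
```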

Mycroft is collaborating with Mozilla on its open source DeepSpeech STT, an open source TensorFlow implementation of Baidu’s DeepSpeech platform. Baidu trails Alibaba and Xiaomi in the Chinese voice assistant market but is one of the fastest growing voice AI companies. Just as Alibaba uses its homegrown, Alexa-like AliGenie agent on its Tmall Genie speaker, Baidu loads its speakers with its DeepSpeech-driven DuerOS voice platform. Xiaomi has used Alexa and Cortana.

Mycroft is the most mature of several alternative voice AI projects that promise improved privacy safeguards. A recent VentureBeat article reported on emerging privacy-oriented technologies including Snips and SoundHound.

Anavi concluded with some demo videos showing off his soothing, Bulgarian AI whisperer vocal style. “I try to be polite with these things,” said Anavi. “Someday they may rule the world and I want to survive.”

Anavi’s video presentation can be seen here:

[embedded content]

Home Assistant: The Python Approach to Home Automation

A number of home automation platforms support Python as an extension, but if you’re a real Python fiend, you’ll probably want Home Assistant, which places the programming language front and center. Paulus Schoutsen created Home Assistant in 2013 “as a simple script to turn on the lights when the sun was setting,” as he told attendees of his recent Embedded Linux Conference and OpenIoT Summit presentation. (You can watch the complete video below.)

Schoutsen, who works as a senior software engineer for AppFolio in San Diego, has attracted 20 active contributors to the project. Home Assistant is now fairly mature, with updates every two weeks and support for more than 240 different smart devices and services. The open source (MIT license) software runs on anything that can run Python 3, from desktop PCs to a Raspberry Pi, and counts thousands of users around the world.

Like most automation systems, Home Assistant offers mobile and desktop browser clients to control smart home devices from afar. It differs from most commercial offerings, however, in that it has no hub appliance, which means there are no built-in radios. You can add precisely the radios you want via USB sticks. There’s also no cloud component, but Schoutsen argues that any functionality you might sacrifice because of this is more than matched by better security, privacy, and resiliency.

“There is no dependency on a cloud provider,” said Schoutsen. “Even when the Internet goes down, the home doesn’t shut down, and your very private data stays in your home.”

Schoutsen did not offer much of a promo in his presentation, but quickly set to work explaining how the platform works. Since Home Assistant is not radically different from other IoT frameworks — one reason why it interfaces easily with platforms ranging from Nest to Arduino to Kodi — the presentation is a useful introduction to IoT concepts.

To get a better sense of Home Assistant’s strengths, I recently asked Schoutsen for his elevator pitch. He highlighted the free, open source nature of the software, as well as the privacy and security of a local solution. He also noted the ease of setup and discovery, and the strength of the underlying Python language.

Easy Extensions

“Python makes it very easy to extend the system,” Schoutsen told me. “As a dynamic language it allows a flexibility that Java developers can only dream of. It is very easy to test out and prototype new pieces on an existing installation without breaking things permanently. With the recent introduction of MicroPython, which runs on embedded boards such as the Arduino and ESP8266, we can offer a single language for all levels of IoT: from sensors to automation to integration with third-party services.”

In Schoutsen’s ELC 2016 presentation, he described how Home Assistant is an event-driven program that incorporates a state machine that keeps track of “entities” — all the selected devices and people you want to track. Each entity has an identifier, a state condition, and attributes. Attributes describe the state in more detail, such as the color and intensity of the light on a Philips Hue smart bulb.

To integrate a Philips Hue into the system, for example, you would need to use a light “component,” which is aware of the bulb and how to read its state (off or on). Home Assistant offers components for every supported device or service, as well as easy access to component groups such as lights, thermostats, switches, and garage doors. Setup is eased with a network discovery component that scans the network and, if you have a supported device, sets it up automatically. 

The software is further equipped with a service registry, which provides services over the event bus. “We can register the turn-on command for a light, and have it send an email or SMS,” said Schoutsen. “A timer can send a time change event every second, and a component can ask to be notified at a particular time, or in intervals. Based on time change events, it will trigger the callback of the components.”

Each component writes its state to the state machine, emitting a state change event to the event bus. “The light component would register its turn-on service inside the service registry so that anyone could fire an event to the event bus to turn on the light,” said Schoutsen.

You can easily integrate a light component with a motion detector component using an automation component. This would listen to the motion detector events, and fire a “turn light on” event to the event bus, which in turn would be forwarded to the service registry. The registry would then check to see that the light component can handle the event. “Automation components can listen for events, observe certain attribute states or triggers, and act on them,” explained Schoutsen.
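The flow described above — a component writes state, an automation listens for the resulting event, and the service registry dispatches the service call — can be sketched in plain Python. This is a simplified illustration of the concepts, not Home Assistant’s actual internals; every class and method name here is invented for the example.

```python
# Simplified sketch of Home Assistant's core ideas: an event bus,
# a state machine, a service registry, and an automation that
# wires a motion sensor to a light. Names are illustrative only.

class EventBus:
    def __init__(self):
        self.listeners = {}

    def listen(self, event_type, callback):
        self.listeners.setdefault(event_type, []).append(callback)

    def fire(self, event_type, data=None):
        for callback in self.listeners.get(event_type, []):
            callback(data or {})

class StateMachine:
    def __init__(self, bus):
        self.bus = bus
        self.states = {}  # entity_id -> (state, attributes)

    def set(self, entity_id, state, attributes=None):
        self.states[entity_id] = (state, attributes or {})
        self.bus.fire("state_changed",
                      {"entity_id": entity_id, "state": state})

class ServiceRegistry:
    def __init__(self, bus):
        self.services = {}
        bus.listen("call_service", self._handle)

    def register(self, domain, service, handler):
        self.services[(domain, service)] = handler

    def _handle(self, data):
        handler = self.services.get((data["domain"], data["service"]))
        if handler:
            handler(data)

bus = EventBus()
states = StateMachine(bus)
registry = ServiceRegistry(bus)

# A light "component" registers its turn-on service.
def turn_on_light(data):
    states.set("light.living_room", "on")

registry.register("light", "turn_on", turn_on_light)

# An automation "component" listens for motion and calls the service.
def on_state_changed(data):
    if data["entity_id"] == "sensor.motion" and data["state"] == "detected":
        bus.fire("call_service", {"domain": "light", "service": "turn_on"})

bus.listen("state_changed", on_state_changed)

# Simulate the motion detector tripping.
states.set("sensor.motion", "detected")
print(states.states["light.living_room"][0])  # -> on
```

Note that the automation never talks to the light directly: it only fires an event, and the registry decides which handler responds — the same decoupling that lets Home Assistant swap real device integrations in and out.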

Another component type handles presence detection. “The platform can check the router to see which phones are connected in order to see who is home,” said Schoutsen. “Other components are responsible for recording event and state history, or for entity organization — grouping multiple entities and summarizing their state.” Components are available for integrating third-party services, such as MQTT or IFTTT, and other components export data to external databases and analysis tools.

Schoutsen went on to explain concepts such as a “platform” layer that sits above the entity components. Each platform integrates an “abstract base class,” which “acts as the glue between the real device and the one represented in Home Assistant,” said Schoutsen. Later, he ran through a code example for a basic switch and explored the use of trigger zones for geofencing.
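The “abstract base class as glue” idea can be illustrated with a toy switch: the base class defines the interface the core expects, and a platform subclass maps it onto a real device’s protocol. This is a hedged sketch with invented names, not Home Assistant’s actual switch API.

```python
from abc import ABC, abstractmethod

class SwitchDevice(ABC):
    """Interface between the core and a concrete switch platform.
    (Illustrative only; the real Home Assistant API differs.)"""

    @property
    @abstractmethod
    def is_on(self):
        """Return True if the switch is currently on."""

    @abstractmethod
    def turn_on(self):
        """Turn the switch on."""

    @abstractmethod
    def turn_off(self):
        """Turn the switch off."""

class DemoSwitch(SwitchDevice):
    """A fake platform standing in for real device I/O."""

    def __init__(self, name):
        self.name = name
        self._state = False

    @property
    def is_on(self):
        return self._state

    def turn_on(self):
        # A real platform would send a network or radio command here.
        self._state = True

    def turn_off(self):
        self._state = False

switch = DemoSwitch("garage")
switch.turn_on()
print(switch.is_on)  # -> True
```

Because the core only ever sees the `SwitchDevice` interface, a platform for a Z-Wave outlet and one for a Wi-Fi plug look identical from the state machine’s point of view.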

As Schoutsen says, Home Assistant is “gaining a lot of traction.” Check out the complete video to see what happens when Python meets IoT.

[embedded content]