ainascan

Since the official launch of Ainascan about one month ago, significant progress has been made on the platform. We have solidified the on-prem Kubernetes infrastructure, added publicly accessible authentication with Keycloak via an in-cloud proxy, made major changes to the mobile application (adding camera support and uploading algorithms), added thousands more annotations and data points to the model data, and developed inventive new training data augmentation techniques for the machine vision models.

On-Prem Infrastructure

When it comes to saving costs, on-premises infrastructure can’t be beat, although the savings depend on the size of your facility and the amount of electricity the computers consume. Because we are running Kubernetes on top of Linux, the installation will be the same if we later move it to the cloud. We have currently set up two boxes, each with 8 CPU cores, 16 GB of RAM, and a terabyte of storage. This provides ample, fully dedicated compute capacity that would cost a lot if it were in the cloud.

To expose these boxes publicly, we needed a way to proxy network requests instead of letting clients connect directly. This is because once we migrate to the cloud to serve a larger user base, we will need a proxy anyway, preferably a load balancer. We settled on creating a virtual network layer with ZeroTier. ZeroTier provides an extremely simple abstraction that creates peer-to-peer connections between nodes on the network as if they were on the same LAN. With a small proxy instance in AWS, we can bridge it to the on-prem boxes very simply. We could have used AWS services to create direct VPN connections, but with the advantages of ZeroTier’s free tier, we couldn’t help ourselves. Once we move to the cloud, all we will have to do is adjust the Kubernetes installation and DNS records and switch to AWS VPC for private networking.

Authentication With Keycloak

Since our service will be available to the public, we needed an identity and access management solution built on industry-standard security protocols. In keeping with the principles of the project, the choice of Keycloak was a no-brainer. Keycloak is an open-source CNCF project that adds authentication to our applications and services. It serves as our self-hosted identity provider, so we don’t have to rely on Google OAuth or pay for an external service. Customers can come directly to us, create an account, and have all the conveniences of refresh tokens and cross-application logins right out of the box with minimal configuration.
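As a rough illustration of the refresh-token convenience mentioned above, the sketch below builds (but does not send) a standard OAuth 2.0 refresh request for Keycloak’s OpenID Connect token endpoint. The host auth.example.com, the realm name, and the client ID are placeholders rather than our actual configuration, and older Keycloak releases prefix the path with /auth.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_refresh_request(base_url, realm, client_id, refresh_token):
    """Build a POST request that exchanges a refresh token for new tokens.

    All argument values are supplied by the caller; nothing here reflects
    a real deployment.
    """
    # Keycloak exposes one OpenID Connect token endpoint per realm.
    url = f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    # The refresh grant is sent as a form-encoded body (OAuth 2.0, RFC 6749).
    body = urlencode(
        {
            "grant_type": "refresh_token",
            "client_id": client_id,
            "refresh_token": refresh_token,
        }
    ).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_refresh_request(
    "https://auth.example.com", "ainascan", "mobile-web", "abc123"
)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would return a JSON document with a fresh access token, assuming the client is public; a confidential client would also need to send its client secret.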

Mobile Application Development

The core development team knows all too well how difficult it is to convince a farmer to use technology while in the field. We have direct experience working with ocean farmers, with small and large land-based farms, and with scientists and researchers. Our application is meant to hide all the unnecessary options and complexity and show users only what they need to see: how to take a photo and how to upload that photo. We are building the application with minimal animation and styling so that it stays blazing fast and makes it clear what is being asked of the user.

In addition, we decided to refrain from writing the bulk of the app in a cross-platform framework like React Native or Flutter. Our mobile app is extremely simple in principle, and we don’t need to write native Java or Swift either. Instead, we are writing the application in React.js and exposing it like any other web application through a web browser. This way, a user can access the application in a mobile browser (on a tablet or phone) or via a web view component in our React Native mobile app. This gives us cross-platform support right out of the box while allowing us to keep working directly with HTML, CSS, and JavaScript.

Model Developments

Our artificial intelligence keeps getting better and better as we collect more data and use the latest and greatest data annotation and augmentation techniques.

Our dataset of Arabica coffee now spans multiple countries (Panama and the USA, in Hawaii) and multiple locations along the Kona coastline from Kona to Milolii. It covers trees from new sprouts to mature ones, elevations from the high cloud forest to the lower, drier areas, organic and non-organic practices, nurseries and open-field conditions, and many instances of diseases and defects such as fungi, algae, insect infestations, and, most importantly, leaf rust. And we are still not done! We are reaching out to our connections to fill the gaps along the Kona coastline and to gather data on the Hilo side in Pahala. Further, we intend to extend our reach to connections in Vietnam, where we hope to gather a diverse dataset of Robusta coffee.

Using the famous Segment Anything Model (SAM), we have reduced the time needed to annotate, allowing us to do a lot of annotation in-house. For leaf annotations, SAM’s zero-shot accuracy is pretty good! However, it often struggles to detect the tips of the leaves, and it fails almost completely when there is too much noise. For these detections, we want to detect many leaf or berry instances per image (on the order of 50 or more per image). Since the computer vision model needs to learn that there are many instances per image, many thousands of individual annotations are required. To combat this, we have had to get inventive with our data augmentation strategies.

Leaves on a tree can come in pretty much any rotation and orientation you can think of. Combine that with the angle at which the user takes the picture and the added complexity of Hawaii’s lighting conditions (cloudy, foggy, rainy, sunny, etc.), and you get an extremely wide variety of contextual backgrounds. The cut-and-paste data augmentation strategy is a promising option for this problem, but most implementations seem to assume that there is only a single instance in the frame. To solve this, we created the open-source phy_cut_paste library, which drops many contours into a physics simulation and lets them get jumbled around! This provides rotation and translation of the contours on any contextual background desired, while preventing overlaps. So far, the initial results are promising, but there is much more testing to be done to see how far the mAP can be improved.
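To make the idea concrete, here is a minimal sketch of multi-instance placement without overlaps. It is not the phy_cut_paste API and it does not run a physics simulation: as a simpler stand-in, it randomly rotates each contour and uses rejection sampling on bounding circles to find a free spot. The function names and parameters are hypothetical and exist only for illustration.

```python
import math
import random


def rotate_contour(contour, angle):
    """Rotate a list of (x, y) points about their vertex centroid (radians)."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in contour
    ]


def bounding_circle(contour):
    """Return (cx, cy, r): vertex centroid and distance to the farthest vertex."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    r = max(math.hypot(x - cx, y - cy) for x, y in contour)
    return cx, cy, r


def place_contours(contours, width, height, seed=0, max_tries=200):
    """Randomly rotate and translate contours so their bounding circles
    stay inside the canvas and never overlap. Contours that cannot be
    placed within max_tries attempts are skipped.
    """
    rng = random.Random(seed)
    placed = []  # entries are (center_x, center_y, radius, points)
    for contour in contours:
        rotated = rotate_contour(contour, rng.uniform(0.0, 2.0 * math.pi))
        cx, cy, r = bounding_circle(rotated)
        if 2.0 * r > min(width, height):
            continue  # too large to fit on the canvas at all
        for _ in range(max_tries):
            nx = rng.uniform(r, width - r)
            ny = rng.uniform(r, height - r)
            # Circles that do not intersect guarantee the contours do not either.
            if all(math.hypot(nx - px, ny - py) >= r + pr
                   for px, py, pr, _ in placed):
                moved = [(x + nx - cx, y + ny - cy) for x, y in rotated]
                placed.append((nx, ny, r, moved))
                break
    return placed


# Twenty 10x10 squares standing in for leaf contours on a 200x200 background.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
layout = place_contours([square] * 20, width=200, height=200, seed=1)
print(f"placed {len(layout)} contours")
```

Bounding circles are conservative, so this packs far less densely than a physics simulation that lets real contour shapes settle against each other, which is part of the appeal of the simulation-based approach.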

Conclusion

For the first month of development, we have made enormous strides towards a usable product. We believe we are still on track to release an alpha version within one to two months. But none of this could have been done without the help of our local community. We want to personally thank:

Luis Aristizabal for his scientific input and connections, which allowed us to learn more about how diseases are spreading and impacting farming practices. Luis has several scientific publications that helped expand our knowledge:

Integrated Pest Management of Coffee Berry Borer: Strategies from Latin America that Could Be Useful for Coffee Farmers in Hawaii

Monitoring Coffee Leaf Rust (Hemileia vastatrix) on Commercial Coffee Farms in Hawaii: Early Insights from the First Year of Disease Incursion

Mark Santiago and the rest of the team at Bay View Coffee Farm for educating us on farming and harvesting practices. Mark’s wisdom and experience further improved our knowledge of day-to-day practices.

Bay View Farms Coffee

The Mountain Thunder Coffee Plantation for providing access to their fields and unique locations in the cloud forest.

Mountain Thunder Coffee Plantation