Last week, at Landing AI, we publicly launched our flagship AI platform, LandingLens. This all-in-one platform empowers users to build a computer vision application from start to deployment. You can try it for free here. Thousands of users worldwide have already leveraged this platform, showcasing interesting use cases such as traffic prediction with drones, real-time action detection in sports, and numerous possibilities in manufacturing and healthcare.
I have been participating in the creation of this platform since day zero. As a builder myself, it has been an exciting and rewarding journey to create an AI platform from scratch. In this blog, I want to share the motivation behind building this AI platform as well as highlight a few key features that I truly enjoy!
Motivation for Building this AI Platform
In the early days of Landing AI, we were dedicated to creating AI applications that solve real-world problems. I worked on using AI to detect air leakage at industrial compressors, spot defective areas on mile-long steel sheets, find imperfect iPhone screens and MacBook keyboards, develop autonomous harvester machines, and more. While it was fun to get our hands dirty and solve difficult real-world problems, the amount of effort needed for each project was enormous. We repeatedly went through the same painful process:
1. Set up cameras and collect imaging data.
2. Define labeling instructions with subject matter expertise, then train labelers to annotate the data.
3. Benchmark multiple state-of-the-art models on the labeled data. Set up a training environment, evaluate the performance, and analyze the mispredictions from the model.
4. Find out that many mispredictions are caused by mislabels. Go back to step 2 and repeat this for a few rounds.
5. Finally, when the performance reaches 99.X%, move to the deployment phase.
6. Optimize the model's latency and memory consumption to meet the requirement of high throughput on the edge, and integrate the application into the end hardware.
7. After X weeks in production, the data distribution starts shifting and the model has to be updated. Go back to step 2 again.
As a result, most projects took at least nine months of group effort to finish, which clearly blocked us from scaling up. To solve this problem, we identified the repeated work and started looking into possible ways to operationalize the process and build tools to assist us. For example:
We built a consensus labeling tool to cross-check labels from multiple labelers and to find and resolve inconsistencies.
We built a training pipeline to schedule and orchestrate ML training at scale.
We built a systematic process to evaluate model performance and conduct root cause analysis.
We built tools to serialize and optimize models for deployment and inference.
...
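To make the consensus labeling idea concrete, here is a minimal sketch of cross-checking labels from several labelers. It is not the tool we built; the function name `consensus_labels`, the `min_agreement` parameter, and the sample data are all hypothetical, and it assumes one class label per image per labeler.

```python
from collections import Counter

def consensus_labels(annotations, min_agreement=1.0):
    """Split images into agreed and disputed sets (illustrative only).

    annotations: dict mapping image_id -> list of class labels,
                 one entry per labeler.
    min_agreement: fraction of labelers that must pick the top label.
    """
    consensus, disputed = {}, {}
    for image_id, labels in annotations.items():
        top_label, votes = Counter(labels).most_common(1)[0]
        if votes / len(labels) >= min_agreement:
            consensus[image_id] = top_label
        else:
            # Keep every labeler's answer so a reviewer can resolve it.
            disputed[image_id] = labels
    return consensus, disputed

# Hypothetical annotations from three labelers.
annotations = {
    "img_001": ["scratch", "scratch", "scratch"],
    "img_002": ["scratch", "dent", "scratch"],
    "img_003": ["ok", "dent", "scratch"],
}
consensus, disputed = consensus_labels(annotations, min_agreement=1.0)
```

With full agreement required, only `img_001` lands in `consensus`; the other two images are surfaced for review, which is where unclear labeling instructions tend to show up.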
With all of this work, we had the essential building blocks for an AI platform.
Since the founding of Landing AI, our goal has been to help more companies harness the value of AI technology. With this in mind, we naturally wanted to create a platform that would enable more people, not just the team inside Landing AI, to build successful AI applications on their own.
Therefore, Andrew Ng, the founder and CEO, and the team set the ambitious goal of building the best computer vision platform and democratizing the power of AI to empower more people.
It was not an easy journey. It took us over two years from the initial ideas and building blocks to arrive at the platform we have today. There were numerous rounds of iteration and revamping on the product and engineering sides. A few times, we had to start from scratch and reevaluate our target customers, user personas, and experience design. We made significant changes to our core ML engine at least three times, adjusting both the framework and model architectures to achieve state-of-the-art performance. Pausing and pivoting have become a regular part of startup life for us.
Now I’m really excited that the platform is finally ready for everyone to try out and use for building computer vision applications.
Five key features that I really love
We developed LandingLens using our previous experience building computer vision applications. Our objective was to create a platform that is fast, effective, and easy to use. I particularly love these five features:
No. 1 - Train a Model in Less Than 5 Minutes!
When I talk to others about this, they are surprised that it's possible! Conventionally, it takes hours or even days to train a deep learning model. This slows down the iteration speed: when you launch a training job with a new treatment (different configuration or label changes), you have to wait a long time to see the results. We found this to be very inefficient, so we spent a lot of effort speeding up the training process. There is no secret weapon behind the scenes, just tireless exploration of different techniques with a clear goal of being fast.
With just 200-300 images, you can train a new model in less than 5 minutes, for a classification, object detection, or segmentation task. This significantly improves iteration speed and reduces GPU costs.
No. 2 - Mislabels? Catcha!
In our past projects, approximately 70% of our efforts on model improvement were spent on improving label quality. This involved refining label definitions, identifying and correcting incorrect labels, and more. This was particularly true for long-tail use cases where the target classes were not common objects in day-to-day life.
We explored techniques to find label inconsistencies and ultimately developed a systematic approach to identifying and fixing mislabels. This feature has been integrated into our platform as a killer feature. It automatically finds mislabels in your data and suggests labeling fixes. As a user, all you need to do is review the suggestions and accept them!
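One common way to surface likely mislabels, shown here only as an illustration and not as the approach used inside LandingLens, is to flag samples where a model confidently disagrees with the given label. The sketch assumes you already have out-of-sample predicted probabilities for each image; `suggest_label_fixes`, the `threshold` value, and the sample data are hypothetical.

```python
def suggest_label_fixes(samples, class_names, threshold=0.9):
    """Flag samples whose given label conflicts with a confident prediction.

    samples: list of (image_id, given_label, probs), where probs is a list
             of predicted probabilities aligned with class_names.
    Returns (image_id, given_label, suggested_label, confidence) tuples,
    most confident disagreement first.
    """
    suggestions = []
    for image_id, given_label, probs in samples:
        best_idx = max(range(len(probs)), key=probs.__getitem__)
        predicted = class_names[best_idx]
        if predicted != given_label and probs[best_idx] >= threshold:
            suggestions.append((image_id, given_label, predicted, probs[best_idx]))
    suggestions.sort(key=lambda s: s[3], reverse=True)
    return suggestions

# Hypothetical classes and predictions.
class_names = ["ok", "scratch", "dent"]
samples = [
    ("a", "ok", [0.97, 0.02, 0.01]),       # model agrees with the label
    ("b", "ok", [0.03, 0.95, 0.02]),       # confident disagreement
    ("c", "dent", [0.40, 0.35, 0.25]),     # disagreement, but not confident
    ("d", "scratch", [0.01, 0.01, 0.98]),  # confident disagreement
]
fixes = suggest_label_fixes(samples, class_names)
```

Here `fixes` contains suggestions for images `d` and `b`, in that order. A human still reviews each suggestion, since a confident model can also be confidently wrong.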
No. 3 - View Predictions and Labels Together
After training a model, it is very useful to compare its predictions to human annotations side by side. Analyzing the model's mispredictions can help contextualize its predictions and identify potential root causes.
In the past, we tried various methods for doing this, including looking at images on local machines, using notebook visualization, and building ad hoc web apps. However, these methods were cumbersome and did not scale well. Eventually, we integrated this functionality into the platform's data browser, allowing for easy review of model predictions and filtering of data subsets. This makes it simple to deep dive into the data, and potentially joyful, too!
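For object detection, comparing predictions with labels usually starts by pairing boxes by overlap. The following is a rough sketch of that pairing step under simple assumptions (axis-aligned `(x1, y1, x2, y2)` boxes, a single class, greedy matching); the function names and the 0.5 threshold are illustrative choices, not the platform's internals.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def compare_boxes(labels, predictions, iou_threshold=0.5):
    """Greedily pair each labeled box with its best-overlapping prediction."""
    matched, missed = [], []
    unmatched = list(predictions)
    for label in labels:
        best = max(unmatched, key=lambda p: iou(label, p), default=None)
        if best is not None and iou(label, best) >= iou_threshold:
            matched.append((label, best))
            unmatched.remove(best)
        else:
            missed.append(label)  # labeled, but the model did not find it
    # Whatever is left was predicted without a matching label.
    return {"matched": matched, "missed": missed, "extra": unmatched}

# Hypothetical labels and predictions for one image.
labels = [(10, 10, 50, 50), (100, 100, 140, 140)]
predictions = [(12, 12, 52, 52), (200, 200, 220, 220)]
report = compare_boxes(labels, predictions)
```

The `missed` and `extra` lists are the interesting ones to review side by side with the image: each entry is either a model error or a labeling error.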
No. 4 - Deploy to Production in Minutes
Our platform promises to support an end-to-end workflow from labeling all the way to deployment. Previously, we spent a lot of effort on serializing model parameters and optimizing them to run inference on GPU and/or CPU as fast as possible. We have now integrated this process into our platform to make the whole experience seamless and simple. After training a model, you can easily deploy it and run inference in your preferred way, from API calls to cloud endpoints to end-to-end inference pipelines running on a GPU physically located near you.
No. 5 - Data-centric Autotuning
In the past, to achieve optimal performance, we had to manually adjust many parameters. For instance, we fine-tuned the anchor box parameters based on the distribution of label dimensions, we adjusted image resizing based on image and label size, and we modified the data augmentation transformation and magnitude based on observed data attributes. After repeating these tasks many times, we developed algorithms that can auto-tune these configurations.
This process is called data-centric autotuning because training configurations are selected based on careful observation of the distribution and attributes inside the data, with the goal of helping users achieve optimal performance without the need for expensive hyperparameter sweeping.
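As one example of choosing a configuration from the data itself, anchor box sizes can be derived by clustering the widths and heights of the labeled boxes. The sketch below uses plain k-means with Euclidean distance and a deterministic initialization; it is a well-known general technique, not a description of our actual algorithm, and `kmeans_anchors` and the sample sizes are hypothetical. It assumes there are at least `k` labeled boxes.

```python
def kmeans_anchors(box_sizes, k=3, iterations=20):
    """Cluster (width, height) pairs of labeled boxes into k anchor sizes."""
    boxes = sorted(box_sizes, key=lambda wh: wh[0] * wh[1])
    # Deterministic init: pick boxes evenly spaced in area order.
    centers = [boxes[(2 * i + 1) * len(boxes) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            nearest = min(
                range(k),
                key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        new_centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda wh: wh[0] * wh[1])

# Hypothetical label dimensions in pixels: small, medium, and wide objects.
box_sizes = [
    (10, 12), (12, 10), (11, 11),
    (50, 48), (48, 52), (52, 50),
    (120, 60), (118, 62), (122, 58),
]
anchors = kmeans_anchors(box_sizes, k=3)
```

For this data the anchors settle on one small square, one medium square, and one wide rectangle, mirroring the three groups of labels, with no hyperparameter sweep involved.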
This feature was online for a while and received good feedback. It is currently hidden to undergo a major upgrade to improve speed and capability. Stay tuned for its release soon!
Continuing to Build in Public
We are thrilled to release our AI platform to the public and offer everyone the opportunity to try it out for free. LandingLens was designed as a comprehensive platform that allows users to easily and quickly create and deploy computer vision applications. Our team at Landing AI invested a great deal of thought and effort into ensuring that the platform is efficient and effective, with features such as fast training times, automatic mislabel detection, and data-centric autotuning.
We are committed to continuously improving the platform, and we encourage you to try it out and provide us with your feedback and interesting use cases. It's truly exciting to think about the possibilities for AI applications in a broad range of industries!