Artificial Intelligence for Good (Pizza)
Over the course of the past week I’ve immersed myself in thousands of photos of pizza, crust, ovens, dining rooms and more… on purpose. I helped my friends at Cassano’s Pizza deploy a digital solution for field audits in December 2015, which is when the digital PAD (Priority Awareness Daily) process began. We’ve accumulated over 25,000 photos across more than six thousand field inspection records. Cassano’s District Managers are expected to visit the 33 stores regularly to complete the PAD process, which includes uploading photos to the cloud. I’ve been diligently backing up the photos quarterly with the idea that I’d figure out something to do with them one day.
A week or so ago I was visiting my friends at Cassano’s Pizza, where I fill the role of Chief Technology Officer and Digital Transformation Strategist. It was their 65th Anniversary Celebration, and there was a big party where a Corvette was given away and the Kings of Chaos performed. I got in early Thursday so I could check in on projects and spend time catching up with my friends. It was a productive trip, and we even connected with another friend who lives in the area for a tour of the Cassano’s Dough Factory.
Pizza Focus AI Is Born
At the party, Chris Cassano’s oldest daughter, who’d recently started working at one of the stores, mentioned that no one makes the pizza the way they’re supposed to: measuring the ingredients to make sure the proportions are right, and so on. The digital PAD process has contributed to improved quality, but the idea that folks weren’t doing their jobs the way they’re supposed to didn’t surprise me. I remembered an article I’d read earlier in the day about how students at MIT had built a neural network to understand pizza, and that got me thinking about how we might leverage artificial intelligence to further improve quality. I’ve always been amazed at what Chris Cassano can tell someone about a pizza by simply seeing a picture.
My four-year-old son was trying to help with the self-checkout at Walmart a couple of months ago. He was moving around in front of me holding a package he was eager to scan when the self-checkout locked up. The attendant came over and showed us the video, stating sarcastically that they had artificial intelligence now. What we experienced was StopLift’s Self-Checkout Loss Detection solution, which has been deployed in most Walmarts. This was my first encounter with AI out in the wild, and it was interesting to say the least. Part of the process to unlock the register is for the attendant to watch the video with you, and it was a bit unnerving to see a target hovering around my child.
Machine Learning in the Cloud
I explored the various cloud-based machine learning options from Google, Amazon, and Microsoft before settling on Google Cloud AutoML Vision, which relies on Google’s state-of-the-art transfer learning and neural architecture search technology. Google makes it easy to get started for free, and I’m familiar with their other APIs, so off I went. I totally enjoyed the Google Codelab Tutorial that walks you through classifying images of clouds in the cloud with AutoML Vision, which is featured in the Cloud AutoML Introduction Video. I’d take the clouds-in-the-cloud model a step further by adding multiple labels that categorize the images by shapes I see in the clouds, like animals and such, but there’s not a business use case for that at the moment.
As I continue down the path of increasing my own human intelligence about artificial intelligence and its potential, I’ve realized that a lot of time needs to go into strategic planning before the training and use of machine learning models can begin. You need hundreds, if not thousands, of images requiring human labeling, so it’s fortunate that I already have the repository of over 25,000 images from PAD. I reached out to my friend Wilder Rodrigues to see if I was on the right track before I got much further, and he was enthusiastic about my ambitions, validating the direction I was going. Wilder reminded me to pay attention to the free tier limits so I didn’t run up any unexpected bills.
Training A Pepperoni Pizza Model
I started sorting through the photos of pizza, putting them into folders with the tops and bottoms (crusts) kept separate. There were plenty of pictures of pepperoni pizza, so I came up with the plan to train a model using a single-label dataset. I figured it would be fun to demo predicting whether a pizza had pepperoni on it or not, and I already had enough photos ready to import for training. I prepared 49 photos of pepperoni pizza and 14 of cheese pizza for labeling. Adding photos to a single-label dataset is straightforward, and the AutoML Vision Beginner’s Guide was extremely helpful, with great pointers and examples for data preparation.
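If you keep one folder per label like I did, generating the import file is mostly mechanical. Here’s a minimal sketch, assuming a folder-per-label layout and a hypothetical Cloud Storage bucket name, that builds rows in the `gs://path,label` shape AutoML Vision expects for single-label image import:

```python
import csv
from pathlib import Path

# Hypothetical bucket; substitute the bucket the photos were uploaded to.
BUCKET = "gs://my-pizza-photos"


def build_import_rows(root: str) -> list:
    """Walk a folder-per-label layout (e.g. photos/pepperoni/*.jpg)
    and build one [image_uri, label] row per photo."""
    rows = []
    for label_dir in sorted(Path(root).iterdir()):
        if not label_dir.is_dir():
            continue
        for photo in sorted(label_dir.glob("*.jpg")):
            rows.append([f"{BUCKET}/{label_dir.name}/{photo.name}", label_dir.name])
    return rows


def write_import_csv(rows, out_path="import.csv"):
    """Write the rows as the CSV file AutoML Vision imports."""
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
```

The folder name doubles as the label, so renaming a folder relabels every photo in it, which is handy when you’re still deciding on label names.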
The documentation states that the minimum required by AutoML Vision training is 100 image examples per category/label. The likelihood of recognizing a label goes up with the number of high-quality examples for each; in general, the more labeled data you can bring to the training process, the better the model will be. Targeting at least 1,000 examples per label is recommended, so my dataset fell short. I admit I was anxious to see what would happen if I trained this first model with fewer images, and I didn’t see the harm in doing so. The Pepperoni Pizza Training Model I built ended up with 100% precision and recall scores for predicting pepperoni and cheese pizza at a score threshold of 0.5.
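That score threshold of 0.5 is worth unpacking: the model outputs a confidence score per label, the threshold converts scores into yes/no predictions, and precision and recall are then computed from the resulting true/false positives and negatives. A minimal sketch of that arithmetic for one label (the scores and truth values below are made-up illustrations, not my model’s actual outputs):

```python
def precision_recall(scores, truths, threshold=0.5):
    """Compute precision and recall for a single label.

    scores:  model confidence per image (0.0 to 1.0)
    truths:  whether the label actually applies to each image
    A score >= threshold counts as a positive prediction.
    """
    tp = sum(1 for s, t in zip(scores, truths) if s >= threshold and t)
    fp = sum(1 for s, t in zip(scores, truths) if s >= threshold and not t)
    fn = sum(1 for s, t in zip(scores, truths) if s < threshold and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Raising the threshold generally trades recall for precision, which is why the AutoML evaluation page lets you slide it and watch both numbers move.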
I did a bunch of tests, and the model even predicted pepperoni when it was one of many toppings. It surprised me on one of the test images: I didn’t see the pepperoni under the cheese, but the model picked up on it and predicted pepperoni. My mind was running wild and I needed to rein it in, as all of the ideas around how we could use artificial intelligence for real business needs were overwhelming. We could train the model to analyze pizzas just like Chris Cassano and eventually implement automated real-time AI quality control measures across the organization. I got to the point of thinking through how one could measure ingredient portions and monitor all food being served, tying it back to inventory to reduce waste and theft.
Training A Pizza Crust Model
The next model I decided to work on was a multiple-label dataset to analyze pizza crust. I sorted out 256 photos of the bottoms of pizzas to prepare for labeling. There wasn’t much out there about whether or how you could apply multiple labels to an image being imported, so I started dividing out the crusts that were square cut. Cassano’s Pizza is known for the signature square-cut pieces that make it less messy. There were 14 photos of pizzas with hand-tossed crust and 242 square-cut pizza crust photos. Based on my experience training the pepperoni pizza model, I figured I had more than enough images to move forward with training a model that could predict whether a pizza crust was square cut or hand tossed.
I added additional labels to experiment with applying multiple labels per image, which has to be done manually by clicking into each image in the user interface. It seems you can also apply labels using CSV files, but I need to explore how that works. I got overambitious, adding more labels than I should’ve, and had to remove half of them before the system would let me train; it requires a minimum of 10 images per label. I’ve learned that you can tell so much about a pizza by looking at the crust. The model could be trained to predict whether the dough had been proofed properly, or whether it was expired, overcooked, undercooked, cut unevenly, or even too moist. Artifacts and spots on the crust are indicators that the oven may need to be cleaned.
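On the CSV route I still need to explore: my understanding from the AutoML Vision data preparation docs is that a multi-label import row is simply the image URI followed by one comma-separated column per label. A small sketch of generating that file, with hypothetical image URIs and the crust labels from my dataset:

```python
import csv

# Hypothetical mapping from image URI to every label that applies.
labels_by_image = {
    "gs://my-pizza-photos/crusts/001.jpg": ["square_cut", "thin_crust"],
    "gs://my-pizza-photos/crusts/002.jpg": ["hand_tossed"],
    "gs://my-pizza-photos/crusts/003.jpg": ["square_cut", "uneven_cut", "spots"],
}


def write_multilabel_csv(mapping, out_path="multilabel_import.csv"):
    """Write one row per image: the URI followed by its labels."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        for uri, labels in sorted(mapping.items()):
            writer.writerow([uri, *labels])
```

Keeping the labels in a plain dictionary like this would also make it easy to hand the labeling job to the person who should actually be doing it.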
I was excited to see what the first version of the pizza crust model would be able to do, and I ended up with labels including artifacts, hand_tossed, moist, spots, square_cut, thin_crust, and uneven_cut. The categories with the most images were square_cut at 242, thin_crust at 24, and uneven_cut at 18. This time the precision came in at 84.8% and the recall at 82.4% across all labels, with higher precision scores for the labels with more images. All of the documentation suggests you need well over 100 images per label for a multi-label model to be more accurate. This dive into machine learning has made me realize that I’m not the one who should be labeling the images, as the models we train need to be based on human input from Chris Cassano.
Business Uses For Artificial Intelligence
I saw an article a few days ago about how you can make millions counting cars in parking lots from space, and that got me thinking of even more ways to use machine learning and artificial intelligence. I think I’m going to work on training another model that looks at photos of pizza ovens to determine if they need to be cleaned, but I’m not sure I can handle dreaming of ovens like I did pizza and crust this past week. Is there something you’re thinking about training a neural network to see and make predictions about?
Digital transformation starts with you being ready for change and understanding that your business will need to evolve. Would you like to explore how to approach the digital transformation of your business? Whether it be moving your training to an online learning management system, or using artificial intelligence for food quality control, I’d love to hear from you.