Making Data More Affordable for AI Projects

In this special guest feature, Xin Song, Co-founder and CEO of Bottos, points out that if AI is the engine that drives innovation in the next few years, then data is the fuel that powers AI. Unfortunately, both the cost and the availability of data are hurdles that many small to mid-size companies have a difficult time overcoming. Bottos is a public blockchain for AI. Xin works with the Bottos team to drive innovation in artificial intelligence, ultimately creating a new decentralized AI ecosystem. He has 13 years of experience in investment, strategy and restructuring of enterprise digitization.

If AI is the engine that drives innovation in the next few years, then data is the fuel that powers AI. Unfortunately, both the cost and the availability of data are hurdles that many small to mid-size companies have a difficult time overcoming.

According to a report from the Brookings Institution, the key to getting the most out of an AI project is having a “data-friendly ecosystem with unified standards and cross-platform sharing.” The report adds that data sets that are accessible for exploration are a prerequisite for successful AI development.

For tech giants with seemingly unlimited budgets, data acquisition isn’t an issue. That’s not the case for smaller companies, especially when the cost of data can take up as much as 50-60 percent of a project’s budget.

Scaling back on the amount of data used to train a model because of budget constraints is a poor option, as it will generally lead to a less accurate model. AI doesn’t deduce conclusions; it learns through trial and error, which takes a massive amount of data.

A lack of adequate data can lead to poorly trained models. Bias in facial recognition technology has been well documented. A researcher at the MIT Media Lab used a data set of 1,270 faces to test three different facial recognition systems and discovered significant misidentification issues based on race and gender.

IBM recently announced that it will release two data sets in an effort to help reduce or eliminate bias in facial recognition technology. One data set contains more than one million images (five times larger than the largest data set currently available), and the second has 36,000 images broken down by skin tone, gender and age.

A handful of companies have been created specifically to make data more available and accessible. Among them are decentralized global data marketplaces created to level the playing field for small to mid-size companies. These blockchain-based marketplaces will bring together data providers, data requestors and data service providers. Some anticipate that these data marketplaces can reduce data costs by as much as 30 percent.

Another avenue for improving the availability of data is to build a community of people willing to contribute data, including facial images. One example is utilizing an app that uses gamification to make data contribution fun and engaging and allows contributors to earn tokens for their efforts.

Facial recognition technology is evolving at a rapid pace. Florida’s Orlando International Airport will be the first airport to implement a Biometric Entry and Exit Program, utilizing facial recognition technology for international passengers. Marriott International is partnering with Alibaba Group to test technology that would allow hotel guests to check in with a quick facial scan. Amazon is testing its “Rekognition” system with multiple law enforcement agencies.

A report from NewVantage Partners, How Big Data and AI are Driving Business Innovation, says that “AI initiatives directly benefit from access to richer, more granular, more complete, more extensive data – in vastly greater volumes, varieties and data sources.”

The report also found that a contributing factor to the growth of investment levels in big data and AI is the proliferation of data volumes and sources that are empowering AI. A survey of CEOs found that 75 percent felt that access to more and bigger data sets would help drive AI.

In order for AI to reach its full potential, the “fuel” that is data needs to become more available and affordable to small and mid-size companies.

 
