A young leader in prediction and estimation via machine learning
"I want to make formerly impossible tasks achievable with the power of algorithms."
The application of big data is raising expectations for the prediction and estimation of factors such as demand.
NEC's heterogeneous mixture learning technology accomplishes the previously impossible task of making highly accurate predictions from large and varied data. 32-year-old Ryohei Fujimaki is the developer of one of its core algorithms. Here this young leader, who is active on the global stage both as a machine learning researcher and as a coordinator of overseas customer cases, discusses his commitment to data and the outlook for his work.
Making previously impractical prediction and estimation possible
--Where is the focus now for the application of big data, and what are people’s hopes for it?
Fujimaki: The application of big data, as its name suggests, involves using enormous volumes of data to derive new value from information. This is attracting attention in a variety of fields as something that could benefit society and business. One typical form of value that can be derived from information is prediction or estimation. By applying large amounts of data, prediction and estimation that were previously carried out using expert knowledge or past models are now possible with higher accuracy in a range of fields. This has been made possible by advances in the infrastructure for gathering large amounts of data, faster data processing, and analytical techniques such as machine learning. Prediction and estimation may sound complex, but there are familiar examples everyone knows well. Consider Google Search or online retailers such as Amazon. These sites preemptively display recommended sites or products based on historical data about the sites an individual frequently visits or the products they often buy. This is an example of prediction and estimation made possible by machine learning.
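The recommendation example Fujimaki gives can be sketched in miniature: suggest items that frequently co-occur with what a person has already bought. This is an illustrative toy only, not how Amazon or Google actually implement recommendations; the purchase histories and item names below are invented.

```python
# Toy co-occurrence recommender: suggest items that often appear in
# past purchase histories alongside what the shopper already has.
# The histories below are invented illustrative data.
from collections import Counter

histories = [
    {"ice cream", "cola", "sunscreen"},
    {"ice cream", "cola"},
    {"rice balls", "tea"},
    {"ice cream", "sunscreen"},
]

def recommend(basket, histories, top_n=2):
    """Rank items by how often they co-occur with the current basket."""
    counts = Counter()
    for h in histories:
        if basket & h:                  # this history overlaps the basket
            counts.update(h - basket)   # credit the other items in it
    return [item for item, _ in counts.most_common(top_n)]

print(recommend({"ice cream"}, histories))
```

Real systems refine this idea with far larger data and statistical models, but the underlying principle, learning from historical co-occurrence, is the same.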
--What is important for successful prediction or estimation through the application of big data?
Fujimaki: The application of big data has a high profile in a variety of fields, but so far only some organizations are actually applying it. First of all, ICT vendors play a big role in determining whether prediction or estimation through the application of big data goes well. Many customers don't have the technology or know-how to analyze big data themselves, and require the support of an ICT vendor. That means the technology and ability of the supporting ICT vendor is a crucial factor. For example, ICT vendors are expected to provide a range of support to guide a customer's big data application to success, including "data-gathering capability" driven by sensor technology, "data processing technology" for handling large volumes of data, advanced "analytical techniques" for prediction and optimization, and "data scientist skills." Currently, the focus seems to be on the importance of data scientists as experts in analytical techniques, but alongside data scientists, domain experts also play an essential role in big data application. Domain experts are professionals well versed in a customer's industry sector and operations. They clarify and coordinate the domain, issues, and objectives, including which data should be applied, in what way, and for what purpose. Close communication between talented data scientists and domain experts is another crucial point for the success of big data application.
--Tell us about NEC's strengths with regard to predictions and estimations applying big data.
Fujimaki: NEC fulfills all the criteria I mentioned above: "data-gathering capability," "data processing technology," "analytical techniques," and "data scientist skills." That means we can provide total support for our customers' application of big data as a one-stop shop. NEC also draws on the accomplishments and know-how it has cultivated by providing solutions across a wide range of industries, and we will continue to invest in developing domain experts in the future.
Currently, a range of vendors are expanding into support for the application of big data, and under these circumstances we believe it is important to establish a distinctly NEC identity. We seek to promote NEC's unique strengths, such as building and operating systems that handle massive amounts of data and complex tasks, capabilities that are beyond smaller ventures. The heterogeneous mixture learning technology we developed has also demonstrated its superiority as a cutting-edge NEC analytical technique for the application of big data.
Machines uncover the value hidden in vast amounts of data
--Please explain heterogeneous mixture learning technology in simple terms.
Fujimaki: Okay, let me use a convenience store as an example. What kinds of things sell well on what type of days? It sounds simple, but predicting this is actually quite difficult. For example, sales of ice cream differ depending on factors such as weather, temperature, and location. When predicting product sales that fluctuate based on a number of different factors, if you are only dealing with a single product, such as ice cream, a domain expert can create a model and predict sales to a certain extent. However, when individual products number in the hundreds or thousands, a diverse range of factors affects the sales of each product, and you have to predict sales at 10,000 stores across the country, you end up with a vast number of combinations, and prediction by conventional means becomes impossible. The heterogeneous mixture learning technology we developed makes this kind of prediction possible because the machine itself automatically derives the appropriate predictive formula from the diverse range of data. Heterogeneous mixture learning technology is a type of machine learning that enables machines to automatically discover recurring patterns and hidden regularities in large volumes of varied data. The machine can also make its own combinations of conditions and factors, sifting through them to intelligently derive the appropriate prediction pattern. This makes it possible to predict the sales of a diverse range of products, such as rice balls and detergent in addition to ice cream, at stores spread throughout Japan.
--Tell us what the objective of developing heterogeneous mixture learning technology was, as well as how it came to be developed.
Fujimaki: NEC is one of the few Japanese companies that have continued research into machine learning for over 20 years. Today, we still conduct state-of-the-art research into machine learning at our Central Research Laboratories in Japan and our research facilities in the United States. Heterogeneous mixture learning technology is the culmination of all this research and hard work. Our job is to listen to customer issues, express a method for resolving them mathematically, and provide this as an algorithm. Algorithms are procedures for calculating formulas to resolve issues; software is the realization of these algorithms. Let me describe the development background of heterogeneous mixture learning technology in plainer terms. At the time, big data was gaining momentum, and with more and more customer issues to solve, our workload was growing at an accelerating pace and we could no longer cope. We had to come up with a way to respond quickly to a large number of customer issues in the future, and work out what was required to raise productivity with a limited number of staff. The answer we came up with was to create a new system that took advantage of machine learning technology. We wanted to delegate some of our work to machines, and to have machines automate cumbersome tasks like combining and sorting through the factors required for prediction, thinking and making decisions on our behalf. This line of thinking was roughly how development of heterogeneous mixture learning technology started. In the beginning, our ideas were shot down by higher-ups who thought they couldn't be done, but we repeatedly derived formulas, performed theoretical analysis, implemented programs, and conducted tests using actual customer data to refine the algorithms and improve their accuracy.
However, after developing a variety of algorithms based on the basic theory created back then, and finally becoming able to resolve a wide range of customer issues, we are, contrary to our original intentions, probably even busier now.
I had a dual role as researcher and overseas coordinator in North America
--Tell us about your first steps as a researcher when you joined NEC.
Fujimaki: In university my major was space technology, but I became interested in machine learning and made it the subject of my thesis, pursuing study and research in my own way. After graduating, I wanted to continue researching machine learning, but at the time there were just a few major Internet companies, and only a handful of these were researching machine learning. In the end, NEC was the first to take me on board.
After joining NEC, I was assigned to the data mining team, and I've been researching machine learning ever since. In the beginning, I was given the job of using machine learning technology for tasks such as the detection of network failures or diagnosis of the cause of vehicle failures. For about three years after I joined the company, I performed applied research that involved solving problems using data. I knuckled down and worked through the data, thinking as long as the problem was solved, any method would do. Unlike university, at a company like NEC I encountered a whole range of data, which made me very happy. It was also a big attraction to be able to actually meet a variety of customers, and hear about a range of issues first-hand. While solving a whole host of problems in this way, I reaffirmed the importance of methods based on mathematical grounds. From about my fifth year, I began to focus on the process of extracting the essence of a customer's problem, expressing this mathematically, and constructing procedures (algorithms) to resolve it. I believe this had a big impact on my subsequent stance and career as a researcher of machine learning. From May 2011, I was temporarily assigned to NEC Laboratories America in California's Silicon Valley, where I had discussions with local staff, and collaborated on the development of heterogeneous mixture learning technology. The United States is on the cutting edge of research into machine learning technology, and it is home to many trailblazing researchers, so I find working here very inspiring. It also appeals to me as a researcher that there are already many leading companies utilizing machine learning, such as Google and Facebook. My job currently involves two main roles, as I am in charge of development of the core algorithm for heterogeneous mixture learning technology, while also serving as leader of overseas cases to support problem resolution for customers outside Japan.
*All trademarks cited here are the property of their respective owners.