NEC launches free "FireDucks" software for accelerating data analysis using Python
- Enables up to 16 times faster data preparation, reducing time and cost of data analysis -
Tokyo, October 19, 2023 - NEC Corporation (NEC; TSE: 6701) today announced the launch of "FireDucks" (*1), a free software program designed to accelerate "pandas," the tabular data analysis library used for analysis with Python, the most widely used programming language in the world today. Capable of carrying out the data preparation required for data analysis up to 16 times (*2) faster than existing products, this newly developed software significantly shortens the time spent on data analysis and lowers computing costs.
The beta version of FireDucks is now available free of charge online (https://fireducks-dev.github.io/).
In recent years, it has become easier than ever to collect massive amounts of data, including sales data from point-of-sale (POS) terminals and e-commerce, and data from financial transactions. To extract valuable analytical results from such data, there is a growing need for data scientists to analyze it using artificial intelligence (AI) and machine learning (ML).
However, in order to prepare for data analysis, large data sets must first be preprocessed. Data scientists are said to spend approximately 45% (*3) of their time preparing data, and this has become a major issue. In addition, the surge in data volume and evolution of AI and ML have led to increased computational complexity. As a result, higher computational costs (e.g., cloud costs) and the consequent rise in power consumption and CO2 emissions have also become problematic.
In view of this, NEC set out to develop FireDucks, a software program designed to accelerate pandas. To develop this software, NEC leveraged the high-performance programming technology and acceleration know-how it has cultivated in its thirty-plus years of experience developing supercomputers.
By making the beta version of FireDucks available to the general public free of charge, NEC hopes to reduce the hours data scientists spend analyzing data and to help address environmental issues by conserving power and lowering CO2 emissions.
1. Accelerated performance
FireDucks is capable of accelerating software programs created using pandas by up to 16 times and on average by about five times (*2). This reduces the overall time data scientists spend working on data analysis by approximately 30% (*4).
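The release does not describe how the 30% figure was calculated. As a rough sanity check only, the sketch below applies an Amdahl-style estimate, assuming that data preparation takes the 45% share of working time cited above (*3) and that only this share is accelerated. The function name and the assumption that the remaining work is unchanged are illustrative and are not NEC's method.

```python
def overall_time_saved(prep_fraction: float, speedup: float) -> float:
    """Fraction of total time saved when only the preparation share is sped up.

    prep_fraction: share of total time spent on data preparation (0..1).
    speedup: factor by which that share is accelerated (> 0).
    """
    if not 0.0 <= prep_fraction <= 1.0:
        raise ValueError("prep_fraction must be between 0 and 1")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    # The accelerated share shrinks to prep_fraction / speedup;
    # everything else is assumed to take the same time as before.
    return prep_fraction * (1.0 - 1.0 / speedup)


# Assumed inputs: 45% preparation share, average (5x) and peak (16x) speedups.
for factor in (5.0, 16.0):
    saved = overall_time_saved(0.45, factor)
    print(f"speedup {factor:g}x -> about {saved:.0%} of total time saved")
```

Under these assumptions the average five-fold speedup gives a saving of roughly 36%, somewhat higher than NEC's figure of approximately 30% but of the same order. The gap is expected, since NEC's internal calculation may use different inputs.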
Parallel utilization of all cores and computation reduction are the primary reasons for this level of acceleration. FireDucks utilizes every core of a multi-core CPU to efficiently process large data sets in parallel. Moreover, rather than executing operations in exactly the order and scope written in the program, FireDucks first determines from the overall process which data is needed to produce the results, and then processes only that data. This reduction in the amount of computation in turn accelerates processing.
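The computation-reduction idea can be illustrated with a small sketch of deferred execution. This is not FireDucks' implementation or API, and the `LazyTable` class and its methods are hypothetical. It only shows the general technique of recording operations first and then skipping work whose result is never read. Multi-core parallelism is not shown, and derived columns are assumed to read only the original columns.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical table that records operations instead of running them."""
    rows: List[dict]
    derived: Tuple[Tuple[str, Callable[[dict], object]], ...] = ()
    predicates: Tuple[Tuple[str, Callable[[object], bool]], ...] = ()

    def with_column(self, name, fn):
        # Only remember how to compute the column; nothing runs yet.
        return LazyTable(self.rows, self.derived + ((name, fn),), self.predicates)

    def where(self, column, pred):
        # Only remember the filter; nothing runs yet.
        return LazyTable(self.rows, self.derived, self.predicates + ((column, pred),))

    def collect(self, columns):
        # Work out which columns the final result actually depends on.
        needed = set(columns) | {col for col, _ in self.predicates}
        # Derived columns that nothing reads are dropped from the plan.
        active = [(name, fn) for name, fn in self.derived if name in needed]
        out = []
        for row in self.rows:
            row = dict(row)
            for name, fn in active:
                row[name] = fn(row)
            if all(pred(row[col]) for col, pred in self.predicates):
                out.append({col: row[col] for col in columns})
        return out


calls = {"tax": 0}

def tax(row):
    calls["tax"] += 1  # count how often the derived column is computed
    return row["sales"] * 0.1

rows = [
    {"store": "A", "sales": 100},
    {"store": "B", "sales": 250},
    {"store": "A", "sales": 175},
]

# The program asks for a "tax" column, but the final result never uses it.
plan = LazyTable(rows).with_column("tax", tax).where("store", lambda s: s == "A")
result = plan.collect(["sales"])
print(result, "tax computed", calls["tax"], "times")
```

An eager library would compute `tax` for every row as soon as `with_column` is called. Because the sketch defers execution until `collect`, it can see that `tax` is unused and never computes it.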
2. High compatibility
Another feature of this software is its high compatibility with pandas. While some libraries are able to achieve faster processing speeds than pandas, they require multiple steps, including the rewriting of the program. FireDucks, on the other hand, is easy to adopt: only one line of the program needs to be rewritten, after which analysis and coding proceed just as they would with pandas.
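The release does not say which line changes. According to FireDucks' public documentation, it is the import statement, with `import pandas as pd` replaced by `import fireducks.pandas as pd`. The sketch below assumes that and adds a hypothetical helper, `first_available`, so that it still runs where FireDucks is not installed. The sample data is invented for illustration.

```python
import importlib
import importlib.util

# Before:  import pandas as pd
# After:   import fireducks.pandas as pd   (the one-line change, per FireDucks' docs)


def first_available(candidates):
    """Return the first importable module name from candidates, or None."""
    for name in candidates:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except (ImportError, ValueError):
            # Raised, for example, when a parent package such as "fireducks" is missing.
            continue
    return None


# Prefer FireDucks, fall back to plain pandas (fallback is for this sketch only).
backend = first_available(["fireducks.pandas", "pandas"])

if backend is None:
    print("Neither FireDucks nor pandas is installed; nothing to demonstrate.")
else:
    pd = importlib.import_module(backend)
    # From here on the code is ordinary pandas code, unchanged.
    df = pd.DataFrame({"store": ["A", "B", "A"], "sales": [100, 250, 175]})
    totals = df.groupby("store")["sales"].sum()
    print(backend, totals.to_dict())
```

In a real program the helper is unnecessary: changing the single import line is the whole migration, and the rest of the script stays as written.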
The following results were obtained when FireDucks was used in actual operations by Toyota Technical Development Corporation (*5) (TTDC).
- 60% reduction in time spent on data analysis using an in-house AI framework (Spicy MINT)
- 76% decrease in the operating time of the analysis PC
An interview in which TTDC employees who have used FireDucks give feedback on the software to members of the development team can be viewed on the following website. (URL: https://www.nec.com/en/global/rd/technologies/202312/index.html)
By providing the beta version of FireDucks free of charge and enabling data scientists to actually use it, NEC will work to improve its functionality while verifying its effectiveness, with the aim of commercializing it within FY2024.
- (*1) This software was developed with the support of the New Energy and Industrial Technology Development Organization (NEDO) in Japan
- (*2) According to NEC test results based on the TPCx-BB benchmark
- (*3) 2020 State of Data Science
- (*4) Based on calculations performed internally by NEC
- (*5) About Toyota Technical Development Corporation (TTDC):
Focused on constructing optimum environments for product development through comprehensive solutions driven by cutting-edge information and technology
About NEC Corporation
NEC Corporation has established itself as a leader in the integration of IT and network technologies while promoting the brand statement of “Orchestrating a brighter world.” NEC enables businesses and communities to adapt to rapid changes taking place in both society and the market as it provides for the social values of safety, security, fairness and efficiency to promote a more sustainable world where everyone has the chance to reach their full potential. For more information, visit NEC at https://www.nec.com.
NEC is a registered trademark of NEC Corporation. All Rights Reserved. Other product or service marks mentioned herein are the trademarks of their respective owners. © NEC Corporation.