With the right financial datasets, a machine learning model might be able to predict the behavior of a given asset. That’s why the financial sector is investing heavily in building effective ML models, as any model that predicts even reasonably well has the potential to generate millions of dollars. Machine learning is already being used to predict the behavior of citizens, which is changing how policy makers do their jobs.
At iMerit, we work every day with some of the brightest minds in the world to organize their data and build effective machine learning models. We’ve witnessed firsthand how these datasets can create an accurate predictive model for financial applications.
Economic and Financial Datasets for Machine Learning
- Quandl: One of the premier sources for financial datasets, Quandl has been used by over 250,000 analysts, asset managers, and investment banks for years. The data has consistently proven to be reliable, accurate, and useful in prediction modeling.
- EU Open Data Portal: Launched in 2012, this repository of data published by EU institutions and agencies covers topics ranging from the environment and employment to science and education.
- World Bank Open Data: Among financial datasets, World Bank Open Data is unique. This dataset covers population demographics throughout the world, along with a wide variety of economic and development indicators that are useful for predictive modeling.
- IMF Data: Compiled by the International Monetary Fund, this financial dataset covers international finances, foreign exchange reserves, commodity prices, debt rates, and investments.
- Financial Times Market Data: This constantly updated repository features worldwide information around stock price indexes, commodities, and foreign exchange.
- Google Trends: If you’re looking to understand what the world has been Googling, then look no further than Google Trends. It offers plenty of data, including financial datasets and search trends.
- American Economic Association (AEA): The AEA features a wealth of economic data and financial datasets, especially US macroeconomic data.
- School System Finances: This repository contains detailed accounting data on US schools and their finances.
- US Stock Data: This always up-to-date dataset features a forensic accounting of all US stocks since 2009.
- CBOE Volatility Index (VIX): The CBOE Volatility Index is commonly cited as a measure of the market's expectations of near-term volatility, as conveyed by S&P 500 index option prices. This time-series dataset includes the daily open, close, high, and low of the CBOE Volatility Index.
- Dow Jones Weekly Returns: This financial dataset gives each stock’s percentage return on a weekly basis. It’s especially useful if you’re training an algorithm to determine which stocks will have the highest returns in a given week.
- EconData: This dataset features thousands of economic time series produced by US government agencies, available in different formats and media. All data is organized in a highly efficient, easy-to-use format so it can be used on an average personal computer.
- Simfin: Featuring data and financial statements from the SEC website, this dataset has been cleaned and organized into a single document that can be downloaded and utilized within seconds.
- Saudi Arabia Public Debt: Featuring data from the Saudi Arabian Monetary Agency, this financial dataset outlines all Saudi Arabia Public Debt accrued between 2005 and 2017.
- AssetMacro: This macroeconomic database includes more than 25,000 economic indicators for more than 120 countries across the world.
- Eurostat Comext: This dataset features trade flows ranging all the way back to 1988, and has been organized by commodity.
- CIA World Factbook: This dataset focuses on the economic statistics of specific countries, along with their corresponding demographics, geography, communications, and military.
- Global Financial Development: This extensive dataset features economic data on 214 economies across the world. Its data goes back to 1960 and continues to be extended.
- RBI Database: This platform is a repository created and managed by the Reserve Bank of India, and focuses on several aspects of the Indian economy. The dataset features information around subjects including employment, financial markets, banking, and more.
- Global Financial Data: The Global Financial Data repository is unique among financial datasets, as it combines daily market data with traditional data and supplements both with further data to create what is considered to be an extremely comprehensive historical economic and financial dataset.
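
To make the weekly-returns use case from the Dow Jones entry more concrete, here is a minimal Python sketch of how weekly percentage returns could be derived from daily closing prices and used to pick the best performer in each week. It is an illustration only: the tickers and prices are synthetic, the names `DAILY_CLOSES`, `weekly_returns`, and `best_performer` are hypothetical, and none of the datasets above is guaranteed to ship in this exact layout.

```python
from collections import defaultdict
from datetime import date

# Synthetic daily closes, invented for illustration only (not real market data).
# Each row is (ticker, trading day, closing price).
DAILY_CLOSES = [
    ("AAA", date(2021, 1, 4), 100.0),
    ("AAA", date(2021, 1, 8), 110.0),
    ("AAA", date(2021, 1, 11), 110.0),
    ("AAA", date(2021, 1, 15), 99.0),
    ("BBB", date(2021, 1, 4), 50.0),
    ("BBB", date(2021, 1, 8), 51.0),
    ("BBB", date(2021, 1, 11), 52.0),
    ("BBB", date(2021, 1, 15), 57.2),
]


def weekly_returns(rows):
    """Return {(iso_year, iso_week): {ticker: percent_return}}.

    Assumption: a week's return is measured from the first to the last
    close observed for that ticker within the ISO calendar week.
    """
    grouped = defaultdict(list)
    for ticker, day, close in rows:
        iso_year, iso_week, _ = day.isocalendar()
        grouped[(iso_year, iso_week, ticker)].append((day, close))

    result = defaultdict(dict)
    for (iso_year, iso_week, ticker), observations in grouped.items():
        observations.sort()  # chronological order within the week
        first_close = observations[0][1]
        last_close = observations[-1][1]
        percent = (last_close / first_close - 1.0) * 100.0
        result[(iso_year, iso_week)][ticker] = percent
    return dict(result)


def best_performer(returns_for_week):
    """Return the ticker with the highest percentage return in one week."""
    return max(returns_for_week, key=returns_for_week.get)


returns = weekly_returns(DAILY_CLOSES)
for week in sorted(returns):
    print(week, best_performer(returns[week]), returns[week])
```

In a real project, the per-week "best performer" label produced this way could serve as a training target, with features drawn from earlier weeks so that the model never sees information from the week it is trying to predict.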