Database Optimization for a Travel Metasearch Engine
A travel fare aggregator wanted to enrich its multi-supplier properties database to offer website visitors a wider variety of choices. The company decided to implement a solution that could map thousands of entities from several external sources, identify duplicates in the portfolio, and combine the unique entries into a core database capable of providing customers with accurate, up-to-date information and a confident booking experience.
The Database Optimization
When ScaleFocus came on board, our DWH team (architects and data analysts) performed a comprehensive analysis of the source databases and data (Microsoft SQL Server, MS Excel, MySQL, Oracle, etc.), checking the compatibility between different attributes and entities. The main challenge was data inconsistency: the same hotel listed under different names, incorrect hotel names, different addresses, different owners, and so on. We handled it by implementing complex mathematical and statistical methods and models. To apply the new logic, our team used Oracle DB and MS Excel to export the data.
ScaleFocus was involved in the following areas of the solution’s implementation:
- Create the comparison logic in order to identify any duplicated data.
- Develop and optimize database queries in Oracle DB (names, addresses, geographical coordinates, etc.).
- Perform data subsetting and data cleansing.
- Perform SQL database tuning and performance optimization.
- Implement customized Levenshtein algorithms.
- Implement a graph and Depth-First Search (DFS).
Technologies
- Oracle DB
- Microsoft SQL Server
- MS Excel
Results
- In just 2 weeks, our client enriched its database with over 5,000 unique properties.
- The solution resolved the data inconsistency issues, raising data correctness from 60% to 94%.
- A more confident reservation experience, free of property matching errors and duplicates in the booking system.
- Easy administration and effective management of a complex multi-supplier portfolio database.
- Cost-efficiency due to outsourcing of time-consuming manual research and maintenance.
- The architecture of the solution saves valuable time and effort in future maintenance and expansion of the database.
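To illustrate how Levenshtein comparison, a graph, and Depth-First Search can work together for duplicate detection, here is a minimal Python sketch. It is not the project's actual implementation, which ran in Oracle DB and used customized Levenshtein variants that this case study does not describe. The function names, the 0.8 similarity threshold, and the sample hotel names are all assumptions made for illustration. The idea: score name pairs by normalized edit distance, connect likely matches with a graph edge, and let DFS collect each connected component as one group of duplicates.

```python
from collections import defaultdict


def levenshtein(a: str, b: str) -> int:
    """Standard (uncustomized) edit distance, using a rolling DP row."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical after cleanup."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def duplicate_groups(names, threshold=0.8):
    """Group names whose similarity chain meets the (assumed) threshold."""
    # Build an undirected graph: one edge per pair of likely duplicates.
    graph = defaultdict(set)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity(names[i], names[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)

    # Iterative DFS: each connected component is one group of duplicates.
    seen = set()
    groups = []
    for start in range(len(names)):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(names[k] for k in component))
    return groups


if __name__ == "__main__":
    # Made-up sample data, not taken from the client's portfolio.
    sample = ["Grand Hotel Sofia", "Grand Hotel Sofa",
              "Grand Hotl Sofia", "Sea View Resort"]
    for group in duplicate_groups(sample):
        print(group)
```

Using connected components rather than pairwise matches means two records end up in the same group whenever a chain of similar records links them, even if they would not match each other directly. A real pipeline would likely combine name similarity with address and coordinate comparisons before adding an edge, and would avoid the all-pairs loop on large data sets.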
Our client is among the world’s leaders in the hospitality industry.