Every year I try to create a small project to learn something new.
This year I decided to create a small web app that works with the IMDB data.
I got the inspiration from SeriesHeat by Jim Vallandingham. While his implementation was great in its own right, two drawbacks stood out to me as worth improving:
- The dataset was old and wasn't being updated. Therefore, more recent shows wouldn't appear.
- The static hosting with sql.js was clever, but sadly very bandwidth-consuming for the end user. (Searching for Game of Thrones would transfer about 30 MB of data.)
I therefore decided to reimplement the service in ASP.NET Core and try to solve these issues.
Apart from that it uses the same IMDB datasets as Jim's version.
So let's get technical. I wanted to gain experience running an ASP.NET Core application in production, which is why I chose .NET Core for this project.
Overall, it was a really pleasant experience working with a batteries-included commercial spin-off framework. It felt like .NET Core had an answer to every question I could throw at it.
The only notably challenging part was the batch data inserts. I used a Postgres database and scheduled a daily job with the brilliant Hangfire framework to extract the IMDB data each night.
As of writing this, that means 31,013 different TV shows with a total of 1,722,431 episodes. While that hardly counts as big data, inserting every record one by one was not an option.
This is why I needed a batch insert, which isn't part of standard Entity Framework Core but requires the open source "EFCore.BulkExtensions" package. This made it seem like I had an outlier case and showed me that the "Core" spin-off frameworks aren't as mature as I thought. But I'm optimistic that this might change.
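To illustrate the general idea of batching (not the EFCore.BulkExtensions API itself), here is a minimal Python sketch. It uses the standard library's sqlite3 as a stand-in for Postgres, and the table name, columns, and sample rows are made up for illustration. It sends rows in chunks inside one transaction instead of issuing one insert per record:

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_insert(conn, rows, batch_size=1000):
    """Insert rows in batches and return the number of batches sent."""
    batches = 0
    with conn:  # a single transaction for the whole import
        for chunk in batched(rows, batch_size):
            conn.executemany(
                "INSERT INTO episodes (show_id, season, episode, rating) "
                "VALUES (?, ?, ?, ?)",
                chunk,
            )
            batches += 1
    return batches


# Hypothetical schema and sample data, assumed only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episodes "
    "(show_id TEXT, season INTEGER, episode INTEGER, rating REAL)"
)
rows = [(f"tt{i:07d}", 1, i, 7.5) for i in range(2500)]

batch_count = bulk_insert(conn, rows, batch_size=1000)
row_count = conn.execute("SELECT COUNT(*) FROM episodes").fetchone()[0]
```

The batch size of 1000 is an arbitrary choice here; the right value depends on the database and the row size.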
The next challenge was a proper search. I first took the naive approach with a simple SQL query, which worked fine, but isn't typo-resistant and feels a bit dated as a user. I then switched to Postgres's built-in tsvector & tsquery solution, which is designed to offer full-text search engine functionality. While it looked very promising in the documentation, in the end it didn't work out for my project. Even simple queries didn't lead to the results I had hoped for, and ignoring "filler words" such as "of" led to unexpected behaviour. That made it clear to me that it ultimately wasn't the right tool for my use case.
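The following toy Python sketch shows the kind of behaviour described above. It is not how Postgres implements full-text search; the stop-word list, the matching rule, and the sample titles are assumptions chosen only to show how dropping filler words and matching exact terms can surprise a user:

```python
# Toy stop-word list, assumed for illustration; Postgres ships its own dictionaries.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def normalize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def matches(query, title):
    """True if every remaining query term appears in the normalized title."""
    terms = normalize(query)
    title_terms = set(normalize(title))
    # A query made up only of stop words has no terms left and matches nothing.
    return bool(terms) and all(term in title_terms for term in terms)


titles = ["Game of Thrones", "House of Cards", "The Office"]

hits_full = [t for t in titles if matches("game of thrones", t)]
hits_stop_only = [t for t in titles if matches("of", t)]
hits_typo = [t for t in titles if matches("game of thrnes", t)]
```

In this toy model the full title finds its show, but searching for "of" alone returns nothing, and a single typo ("thrnes") also returns nothing because terms must match exactly.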
I then switched to a tool named Sonic, which aims to be a lightweight Elasticsearch alternative. Truth be told: I've never tried Elasticsearch before, but if a technical tool's site translates itself into proper German when I visit it, someone is earning good money with it, and that doesn't seem like a good fit for my side project. 😉
So after quickly reading through the brief Sonic documentation, I gave it a try and was really happy with how well it worked with almost no configuration.
So overall I really liked my work on Serieshue. It wasn't particularly challenging, but I got some really interesting tools and frameworks to work together and create a neat little web app.