October 16, 2018

Is software getting worse?

My question targets not only desktop software but also mobile apps, web applications and websites. It seems like software is getting worse and it won't get better. And I just have to start my rant with the worst offender - mobile apps.

A very good example is Tinder. This thing needs almost 25 seconds to load and takes 50MB of space. The usability of the profile page is very bad (oh, you don't need to sort your photos), the application sometimes fails to load profile pictures, and transitions between pages are laggy even when the application is fully loaded in memory. And we are talking about something that is basically a chat (not even a good one).

You would imagine that with such high earnings, the company would easily be able to craft a better user experience and improve the performance. But why bother, when new phones with better hardware are released constantly? We are getting better hardware, but sadly the user experience stays the same because the software is getting slower. After a few Android updates, I can install only a few applications because the internal storage can barely fit the updated OS. Even though my phone is far superior to the first version of the Samsung Galaxy S, the loading times are somehow slower.

Desktop software isn't any better. The last time I tried to install Visual Studio, it took forever to download and install several gigabytes of data. The size of the whole package can surely be explained, but it still surprises me how a software package can take so much space. We aren't talking about a computer game here, where the executable is several times smaller than resources like textures, sounds, music and 3D models. It also contrasts with the exceptions, which often come from open source territory. These exceptions can offer a minimal footprint and rich features packed in a single executable file. Visual Studio also installs several gigabytes of itself on the system drive, even if you selected a different drive. Not nice when your system drive is an SSD with limited space.

Another concerning trend is the rising usage of web technologies in desktop applications. When I saw the Atom text editor for the first time, I was excited to try a lightweight alternative to bloated (I don't mean it in a bad way) IDEs for coding. The editor is indeed faster than a full-fledged IDE, but it hardly offers a lightweight and smooth user experience. When you compare it against the C++-based Sublime Text, the difference is noticeable. I'm scared of the day when web technologies take over the desktop environment. We are slowly getting there. Sooner or later, we will start seeing advertisements in the software we use daily. Heck, Windows 10 is already doing it.

Some may see poor performance as a non-issue. You could even say that most desktop apps are pretty fast. But let's take a look at video games. They have to do a lot of things in one second - handle tens to thousands of entities, solve physics, process the AI, calculate object occlusion, do the pathfinding and many other tasks. All of that not just in under one second, but most of the time in under 16 milliseconds, which is roughly the time available for a single frame at 60 frames per second. If we use this as a standard, the performance of almost any application is unacceptable.
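To make the 16-millisecond figure concrete, here is a minimal sketch of the arithmetic behind a frame budget. The function name and the list of frame rates are chosen just for illustration.

```python
def frame_budget_ms(frames_per_second: float) -> float:
    """Return the time available to produce one frame, in milliseconds."""
    return 1000.0 / frames_per_second


# Illustrative frame rates; 60 fps gives the roughly 16 ms mentioned above.
for fps in (30, 60, 144):
    print(f"{fps} fps -> {frame_budget_ms(fps):.2f} ms per frame")
```

Everything a game does for one frame (physics, AI, path finding, rendering) has to fit inside that single budget.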

And websites? This is where we have already lost the battle. We are wasting so many resources, so much time and bandwidth by downloading all those interactive video advertisements and animated banners. Maybe we are used to it, but just visit dev.to and imagine every website having the same response times. We dreamed about faster websites when the internet was a new thing. We still have the same dreams now, when a fast internet connection is relatively available.

As you can probably imagine, I'm not very happy with the current state of software, but there are always exceptions. I'm using several development tools with acceptable performance and some of them are even developed for free. That means creating fast and responsive software is achievable under any conditions. It's all in our hands.

Our team

October 1, 2018

Mongo security

Sooner or later, you will inevitably have to secure your database. At least in my opinion... It was the first time I was securing a MongoDB instance, and when I was looking for some information I came across this blog post. First, if you want to set up username & password authentication for your MongoDB instance, you will find the article really helpful. Second, you can take much more from the post... specifically almost 600TB of data from all around the world (if you wish, of course).

MongoDB security

So... I promised a remarkably big bundle of data. The good news is that getting it is as easy as copy-pasting a command into your console... if you have 600TB of storage space. If you want to try this out, just read the analysis mentioned in the blog post referenced above. The bad news is that probably almost none of those hundreds of terabytes out there are publicly accessible intentionally. This data comes from more than 30 000 completely unsecured MongoDB instances. It is natural to ask for a reason - so why is it that such a large number of MongoDB instances serve data to just anyone who asks for it?

First, it is important to note that 30 000 IS a big number. The main reason behind this global neglect of security is simple - for a fairly long time, the default MongoDB configuration was left completely unsecured.

Security is usually the first argument against using Mongo, and it has been heavily criticised in that field (especially after a lovely global ransomware attack). But MongoDB developers are definitely not the only ones responsible for the situation - in fact, those 30 000 instances are what 'nah, it'll be OK' looks like - thousands of people just didn't want to spend a while configuring even a simple authentication mechanism. The idea that your data is safe because it is not any kind of top secret information, combined with a little bit of natural human laziness, yields results (almost 600TB of results).

After I realised how simple it is to set up basic authentication, I wondered why so many people haven't taken these simple steps. In my opinion the biggest problem with data security is the perception of data - sadly not everyone sees it as something valuable nowadays... but just as I would not leave my wallet or phone unattended, I would not do so with data. The second important factor is our human nature - everyone has this 'nah, it'll be OK' in them. For me the most important message is that setting up at least username & password authentication for your MongoDB is a small step for a developer, but a huge leap for the security of your data.
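On the client side, username & password authentication mostly comes down to passing credentials in the connection string. Below is a minimal sketch using only the Python standard library. The user name, password, host and the helper `build_mongo_uri` are made up for illustration, and it assumes the server already has access control enabled and the user was created in the `admin` database.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user: str, password: str, host: str = "localhost",
                    port: int = 27017, auth_db: str = "admin") -> str:
    """Build a MongoDB connection URI that carries credentials.

    Hypothetical helper: the defaults are assumptions for this example.
    """
    # Characters such as '@', ':' and '/' in credentials must be
    # percent-encoded, otherwise the URI cannot be parsed correctly.
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/?authSource={quote_plus(auth_db)}"
    )


# Made-up credentials, purely for demonstration.
uri = build_mongo_uri("app_user", "p@ss:word/1")
print(uri)
```

A URI built this way can then be handed to whichever MongoDB driver you use. The credentials themselves should come from configuration or a secret store, not from source code.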

Our team

September 3, 2018

Coalescence of Business & IT and the delayed train

Ever since I started working in both business and IT, I have been wondering why these two sectors were seen as separate for such a long time. It seems that in the last few years people have started realising they can work together. Nowadays, there are many companies based simply on digital products. Look at Airbnb, Uber, Google, Spotify... without the technology, some algorithms and great programmers, they are nothing.

Business business business

But! Let's stop for a moment there. These companies grab the best of the IT and technology world and base their business on the digital world. But what about companies providing services? For instance banks, insurance companies, state-owned city transport companies, railway companies and many others. Are they using technology? Are they using it effectively, and do they even know the possibilities? Sometimes, as we all know, it is much more a question of "do they want to know the possibilities for optimising the business, or is it all just working fiiiiine for them without trying to find new ways?". I have seen many times how companies, startups or individuals reach for more technology only when they are running big losses, when they are desperately trying to save the business. But shouldn't it be the other way around? Technology helping to grow the business as early as possible? Providing the best service to people and great revenue to the company asap?

I remember that some months ago I was at a hackathon for Ostrava's public transport and I realized how the company works. How the people work. You see, Ostrava has one of the most optimized city transport systems (which I would not have guessed until I saw it) and one of the most modern ones (you could buy a ticket in a tram/bus by card way before you could in Prague, which is kind of ironic as it is the capital, full of tourists ^^). Why in Ostrava? Because there are people who want to change things, because there are people trying to move mountains by doing hackathons and educating themselves about data and new technologies. Indeed, crazy ideas (in the best meaning of the word) were presented at the hackathon, such as face recognition with machine learning to tell whether people are satisfied with the services, or data visualisation of people using the transport. That could lead to optimized timetables for tram lines where traffic is very low and, vice versa, to reinforcing lines where a large number of people are trying to get from point A to point B. And it is so simple: all the people attending the hackathon tried to help with technology and find the best solution with their own approach. Thumbs up for that.

Wasting money

The same should work for the banks, the insurance companies and all the big corporations. They all seem to have an internal IT team. A friend of mine and I recently discussed whether these IT teams have the same range of vision and passion for technology, and with that bring quality and freshness. Is it really so hard to accept modern technology and growth, or to use open source instead of big, heavy, old software - saving thousands of euros just with this little change? Or even just to use a different project management approach than waterfall? Fortunately, there are companies which are trying to find a way to provide better, faster and cheaper service with smart usage of IT, software, BI and other technologies.

Still, there are others that need to optimize far more and maybe take a different approach to things. Because maybe, if they were willing to use more modern technology and approaches... I would not sit in a train delayed by an hour on a 2.5-hour journey again and again. So, maybe let's do a hackathon to brainstorm ideas on how to provide a better service to travelers.

Our team

August 24, 2018

Continuous Integration (CI)

Let's face Continuous Integration, the development practice that requires each developer to integrate code into a shared repository at least once a day. Each check-in is then verified by an automated build, allowing teams to detect problems early. "Continuous" here means integrating regularly, so that you can detect errors quickly, locate them more easily and remove them.

It is also about verifying whether the new code you just wrote broke code that was already working, since the automated tests and other tasks (like syntax verification) are executed when integrating the code. You can't, however, expect Continuous Integration to get rid of bugs.

Another very important thing when talking about CI is that it needs to be supported by a suite of automated tests (not only unit tests, but also integration tests and, even better, if possible, end-to-end tests).

The best part is that Continuous Integration is cheap. Not integrating continuously is expensive. If you don’t follow a continuous approach, you’ll have longer periods between integrations. This makes it exponentially more difficult to find and fix problems. Such integration problems can easily knock a project off-schedule, or cause it to fail altogether.

Continuous Integration is composed of some essential tasks, as Matthew Setter describes. You can look them up, but I would like to have a look at the most critical ones.

Make your build self-testing
A self-testing process is the kernel of continuous integration. The build has tests that validate the software. No matter whether you use BDD (Behavior Driven Development), TDD (Test Driven Development), or any of the other xDD’s, testing needs to be front and center in the build process.

Automate the build
Automating the build builds on the fact that it is self-testing. You have the tests in place, now make sure they are run every time. This is a natural complement to software validation.

Make it transparent
The software’s tested before it’s deployed. The deployment happens the same way every time.

Test in a clone of the production environment
This highlights a challenge that has plagued web-based applications for some time. Speaking from personal experience, whether developers develop on Linux, use OSX, or Windows, they usually host on Linux. Even when we develop on the same platform, we may not consider library versions, the existence of extensions, or the extensions’ versions, which can cause problems. So many things can go wrong after the application’s deployed.

Make it easy for anyone to get the latest executable
No matter whether it’s a senior or junior developer, whether it’s a long-term employee or someone brand new to the company, getting a working build of the latest copy of the application or service should be child’s play.
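The self-testing, automated build described above can be sketched as a small script that runs every step on each check-in and stops at the first failure. This is only an illustration: `run_build` and the two inline steps are assumptions made up for the example, and a real project would call its own linter and test runner instead.

```python
import subprocess
import sys


def run_build(steps):
    """Run each (name, command) step in order and stop at the first failure.

    Returns (True, None) when every step succeeds, otherwise
    (False, name_of_failed_step).
    """
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return False, name
    return True, None


# Hypothetical steps: tiny inline commands stand in for a real syntax
# check and a real test suite so the sketch stays self-contained.
steps = [
    ("syntax check", [sys.executable, "-c", "compile('x = 1', '<demo>', 'exec')"]),
    ("tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]

ok, failed_step = run_build(steps)
print("build passed" if ok else f"build failed at: {failed_step}")
```

Because the script reports the first failing step, a broken check-in points directly at the stage that needs attention, which is what makes frequent integration cheap.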

CI is not just a development practice; it also matters for being competitive in the market. It is very good if you can launch new features that matter to your users faster than your competitors, so you gain an advantage and a better time to market. CI also enables another practice that is very important these days: continuous delivery.

Our team

July 26, 2018

PDI SAS Reader


If you work in FinTech, sooner or later (but probably sooner) a SAS data set will get into your project. In a field dominated by one system, there is hardly room for rejection. Unfortunately, for small companies it is very hard to integrate SAS into their stack. In our projects, we use ETL to gather data from different sources. To be more precise, we use the open source tool Pentaho Data Integration.

Even though it has some flaws, it has worked very well for us so far - with one little, yet important exception. The input step isn't able to read compressed SAS data sets. After discussing different options, we opted to roll our own solution. Starting from scratch probably wouldn't have been worth it and, fortunately, we didn't have to do that. We found the Parso library - a lightweight open source SAS7BDAT reader.

The API was very easy to learn. On the other hand, it took significantly more time to code the plugin itself, as we had no previous experience with making PDI plugins. Still, it took us around 2 days of pure development time to have something usable and about a week of testing, fixing bugs and implementing nice-to-have features.

We just love to use open source software in our projects, therefore it felt right to open source the PDI SAS reader too. So if you have any problem or feature idea, please visit our issue tracker on GitHub.

But enough history, let's get into those meaty features we implemented. The most significant quality-of-life improvement is the way columns are defined. If you don't generate the SAS data set yourself (but maybe even if you do), the order of columns may change.

Not surprisingly, the reader step uses column names to identify fields in the stream. The lookup is done only for the first row and the indices are cached, so it shouldn't have an impact on performance. This way, new columns can be added anywhere in the data set without breaking your transformation, which is just awesome.
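The idea of resolving columns by name once and then reusing cached indices can be sketched as follows. This is not the plugin's actual code (the plugin is a PDI step built on Parso). The function names and sample data are assumptions made up for illustration.

```python
def build_index_cache(file_columns, wanted):
    """Map each wanted column name to its position in the file, once."""
    positions = {name: i for i, name in enumerate(file_columns)}
    return {name: positions[name] for name in wanted}


def read_fields(row, index_cache):
    """Pick the wanted values from a row using the cached positions."""
    return {name: row[i] for name, i in index_cache.items()}


# Hypothetical data set: NEW_COL was added later, in the middle of the file.
file_columns = ["ID", "NEW_COL", "AMOUNT", "NAME"]
rows = [(1, "x", 9.5, "Alice"), (2, "y", 3.0, "Bob")]

# Name lookup happens once; every row afterwards uses plain indices.
cache = build_index_cache(file_columns, ["NAME", "ID"])
for row in rows:
    print(read_fields(row, cache))
```

Since only names are matched, the extra column and the changed order do not affect which values end up in the output.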

The latest feature (at the time of writing this article still in snapshot) is also related to columns. At the start of the project, all columns were mandatory. If any was missing, an error was thrown. This was changed later, when optional columns were implemented. If a column is flagged as optional, its presence in the file isn't checked, and if it is missing, the field in the stream will have a null value.
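The difference between mandatory and optional columns can be illustrated with a short sketch. Again, this is not the plugin's implementation, only a hypothetical model of the behaviour described above, with made-up names and data.

```python
def resolve_columns(file_columns, wanted):
    """Resolve (name, optional) pairs to file positions.

    A missing optional column resolves to None; a missing mandatory
    column raises an error.
    """
    positions = {name: i for i, name in enumerate(file_columns)}
    resolved = {}
    for name, optional in wanted:
        if name in positions:
            resolved[name] = positions[name]
        elif optional:
            resolved[name] = None
        else:
            raise KeyError(f"mandatory column missing: {name}")
    return resolved


def read_fields(row, resolved):
    """Build the output fields; unresolved optional columns become None."""
    return {name: (None if i is None else row[i]) for name, i in resolved.items()}


# Hypothetical file without the optional NOTE column.
file_columns = ["ID", "AMOUNT"]
wanted = [("ID", False), ("NOTE", True)]

resolved = resolve_columns(file_columns, wanted)
print(read_fields((7, 12.5), resolved))
```

Flagging a column as optional trades an early error for a null value in the stream, so it fits columns whose absence the rest of the transformation can tolerate.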

Many other checks were implemented too - missing step connections, accessibility of the SAS file, a file emptiness check and a column presence check.

Warnings, errors and additional information about the file (number of columns and rows) are shown to the user when using the Verify this transformation button. The usability of the step itself was also an important topic for us. The output fields can be renamed and values can be converted to the desired output format.

We thought it would be a good idea to force the output format to BigNumber when using the Get Fields feature (to save the mouse some clicks), so we added an option which, when checked, suggests this format instead of Number. It turned out it wasn't needed at all for our use case, but you might find it useful for some projects. In fact, we would like to know your opinion about the plugin in general. Give us your input and who knows, we may implement your desired feature!

Our team