I talked with Zotya Tóth from Prezi about data analysis, big data and different data interpretation methods and how it’s done at a successful startup like Prezi. (Note: This was my first Big Data Interview in September 2014. Stay tuned for more!)
Petabyte of data, 640 scripts, transparency…
He offered a lot of great insights. My favorite facts were:
- The company has a very strong internal transparency policy. Every employee can access all of the reports – including annual revenue, active users and everything else.
- Prezi’s internal servers utilize 640 scripts to create automated reports daily or even more frequently.
- They work with and analyze approximately 1 petabyte of user-interaction data (= 1,000 terabytes = 1,000,000 gigabytes). They use this to understand how customers use Prezi, where they click and why.
- Self-service. Every team can access the data they need. Thanks to a good infrastructure they can access what they need easily on demand.
Tomi: How big is the Data Team now?
Zotya: We have 9 people on the team now (note: September, 2014). Within that, a team of three makes up Data Services. We are the backbone of the whole data infrastructure. On the one hand, it is our responsibility to ensure that the data gets to the data warehouse; on the other hand, we have to make sure that every data analysis tool the different teams might use is online and working. With the latter we have made some very good progress: the tools automatically signal us if there is any malfunction.
Our main project now is the ETL, which is the spine of data transmission here. We developed a complex system in-house for this. Here at Prezi it is very important that data usage is entirely self-service. This means that if you are part of a product team, you can compile your own reports and you know where to find the data you need and how to compile it, while we provide the platform for you. This way a team of three can serve a company of more than 200 employees.
– What do you do with the data at Prezi?
– First, there is core data that influences strategic decisions. The executive team uses it very often to decide which direction the business should go. Examples are the share of active users among all users, revenue, churn rate and conversion from free to paid accounts. We have a lot of data here that tracks active usage, growth and revenue.
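As a rough illustration of the core metrics mentioned here, the sketch below computes an active-user share, a free-to-paid conversion rate and a churn rate from one period's counts. The field names, the exact metric definitions and all numbers are assumptions made for this example; the interview does not say how Prezi defines or calculates them.

```python
from dataclasses import dataclass


@dataclass
class PeriodSnapshot:
    """Hypothetical counts for one reporting period (not Prezi's schema)."""
    total_users: int
    active_users: int
    free_users_at_start: int
    upgraded_to_paid: int
    paying_at_start: int
    cancelled_paid: int


def core_kpis(s: PeriodSnapshot) -> dict:
    """Compute three common ratios under simple textbook definitions."""
    return {
        # Active users out of all registered users.
        "active_share": s.active_users / s.total_users,
        # Free users who upgraded during the period.
        "free_to_paid_conversion": s.upgraded_to_paid / s.free_users_at_start,
        # Paying users who cancelled during the period.
        "churn_rate": s.cancelled_paid / s.paying_at_start,
    }


# Made-up numbers, chosen only to keep the arithmetic easy to follow.
snapshot = PeriodSnapshot(
    total_users=1000,
    active_users=250,
    free_users_at_start=800,
    upgraded_to_paid=40,
    paying_at_start=200,
    cancelled_paid=10,
)
kpis = core_kpis(snapshot)
for name, value in kpis.items():
    print(f"{name}: {value:.1%}")
```

Other definitions are equally common (for example, churn measured on revenue rather than on accounts), so the formulas above are one option among several.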
In addition, every product team has its own KPIs (Key Performance Indicators). They know what they want to achieve in a semester and they measure that. They dissect this data and analyze it on different levels, down to very fine details. They often change some small detail on the web and run an A/B test. For example, they change the location of a button and send the new version to 5% of the users, while the other 95% still see the original version. They want to see whether the 5% use the new layout significantly more, that is, whether it is worth moving the button. If the answer is yes, they push out the change to the rest of the users. They run a lot of these tests in parallel, even with much bigger changes, but the button is a good example.
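The button test described above can be sketched in two steps: assign roughly 5% of users to the new version, then check whether their click rate is significantly higher. The code below is a minimal sketch under stated assumptions, not Prezi's actual method: the hash-based bucketing, the function names `in_variant` and `two_proportion_z_test`, the 0.05 significance threshold and all counts are hypothetical, and the significance check is a standard one-sided two-proportion z-test.

```python
import hashlib
import math


def in_variant(user_id: str, experiment: str, share: float = 0.05) -> bool:
    """Deterministically place about `share` of users in the new version."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < share


def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, one-sided p-value) for 'B clicks more often than A'."""
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    # Upper tail of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Made-up counts: 95% of users see the original button, 5% the moved one.
control_users, control_clicks = 95_000, 9_500
variant_users, variant_clicks = 5_000, 575

z, p_value = two_proportion_z_test(
    control_clicks, control_users, variant_clicks, variant_users
)
roll_out = p_value < 0.05  # assumed threshold
print(f"z = {z:.2f}, p = {p_value:.4f}, roll out: {roll_out}")
```

Hashing the user ID together with an experiment name keeps each user in the same group on every visit and lets several experiments run in parallel with independent splits.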
When a new feature is implemented they know what they have to measure: How many customers are using it, how many of them keep using it, what’s the error rate, how user behavior changes – which basically means we track how easy it is to make a prezi.
We have internal measurements for this and developers always check how they change every time a new function is introduced.
– How do you keep the data up to date? And how do you view it?
– We have approximately 640 scripts running every evening, or even every hour, and they send their results to different visualization platforms. For example, we have a tool called chart.io and we use GoodData, too. Also, we have the Plotserver, which is an open source tool developed in-house…
– This was developed by Prezi and made open source?
– Yes, anybody can use it. (link: https://github.com/prezi/plotserver)
On top of this, we also have Prezi Analytics, which is handled by the Metrics team. This is a crucial element of the decision-making process here, so there cannot be any mistakes. Information here must be timely and accurate at all times. If, as a Prezi employee, you need some business information or metrics, or you want to know what’s happening with the company, you just use this web interface and in minutes you can click together your own report.
– And can everyone at the company access this?
– Yes, this is very important. It is the philosophy of Prezi that, if we want to be a truly data-driven company, we must give all of our employees access to all of our data. Classified data does not exist here. From the moment you are hired you can see every metric retroactively: revenues, number of users, behavior of users, everything. Whatever you want, whatever you need. Simply because we think you need data to make the best decision.
– Thank you for the interview!