Analytics are all but essential to software development if you wish to be responsive to your users. But in commercial apps, the problem with analytics is the trust concern they raise in your customer base. Famously, Microsoft introduced them years ago and triggered a firestorm of complaints; it seemed that analytics were about as popular as Clippy. The concern, of course, was about what, exactly, Microsoft might be harvesting.
Operating without analytics is a bigger problem, as you have no data on which areas of your product are most used, least used, or even unused. Consider a large product which includes over 300 reports, each of which was developed to meet a need, or at least a perceived need. But after the initial excitement of your users, how many, and which, of these reports are actually used?
Absent any analytics data, the best you may hope for is that customer support has a sense of which reports are most popular. But that is of little or no value with respect to little-used or unused reports. Each time you assign maintenance work on these reports, you must wonder how much of the work may be wasted. And who can afford to maintain unused features?
Obviously, this problem of analytics applies to much more than reports, but reports and exports are common to many products, so they make a good starting point for the discussion.
Is there a safe way to add analytics and, by doing so, gain a positive response from the user base? I think that it can be done, but it requires careful design and even more careful presentation to the customer base. Let’s review the likely concerns.
- Privacy is essential. There is real cause for concern where these data may be of value to users’ competitors.
- Any data harvested must avoid revealing anything other than program function.
- However useful it may be to know the specific customer, that identity cannot be in the normal results.
Returning to the report example, how little may we return from a report and still get useful information? Clearly, we need the report identity. Let’s also add how many times it was generated on the display and how many times it was output to PDF or printer. We will also add the date, which is useful in recognizing trends. These selections honor the criteria above, and should not cause customers any concern.
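As a rough illustration of how small that record can be, here is a minimal Python sketch. The names `ReportUsage` and `UsageCounter` are hypothetical, invented for this example, and the in-memory storage is an assumption; a real product would persist the counts somewhere. The point is only that the record holds the four fields named above and nothing that identifies a customer or user.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ReportUsage:
    """One row of usage data: a report, a day, and two counters."""
    report_id: str
    day: date
    displayed: int = 0  # times generated on the display
    exported: int = 0   # times output to PDF or printer


class UsageCounter:
    """Accumulates counts per (report, day); holds no customer or user identity."""

    def __init__(self):
        self._rows = {}

    def _row(self, report_id, day):
        # Create the row on first use so unused reports never appear here.
        key = (report_id, day)
        if key not in self._rows:
            self._rows[key] = ReportUsage(report_id, day)
        return self._rows[key]

    def record_display(self, report_id, day):
        self._row(report_id, day).displayed += 1

    def record_export(self, report_id, day):
        self._row(report_id, day).exported += 1

    def rows(self):
        """Return all rows ordered by day, then report."""
        return sorted(self._rows.values(), key=lambda r: (r.day, r.report_id))
```

The report code would call `record_display` or `record_export` each time a report is produced, and nothing else about the report's contents or parameters is ever captured.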
The collected data can, however, tell us much about how our product is used. We can plot usage over time, per report, and from that we can recognize the utility of the reports to the customers. We can also discover which reports may not be used at all, and from that, can better allocate developer efforts.
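To make that concrete, the following Python sketch shows one way the publisher might summarize the data once it arrives. The function names and the row layout (report identity, date, display count, output count) are assumptions made for illustration, not a description of any existing tool.

```python
from collections import Counter
from datetime import date


def monthly_totals(rows):
    """Sum display and output counts per (report_id, 'YYYY-MM')."""
    totals = Counter()
    for report_id, day, displayed, exported in rows:
        totals[(report_id, day.strftime("%Y-%m"))] += displayed + exported
    return totals


def unused_reports(all_report_ids, rows):
    """Return the reports in the catalog that never appear with a nonzero count."""
    used = {
        report_id
        for report_id, _day, displayed, exported in rows
        if displayed + exported > 0
    }
    return sorted(set(all_report_ids) - used)
```

The monthly totals can be fed to any plotting tool to show trends, while the unused list is a direct candidate list for reduced maintenance effort.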
Although there are commercial analytics tools from major companies, such as Google and Microsoft, I would expect customers to be leery of them. Instead, I suggest accumulating the data and then, perhaps monthly, emailing it to the publisher, first giving the user the opportunity to inspect the data and approve or decline sending it. Also, using a simple format like CSV makes it easy for the user to be confident that nothing is hidden. Each such action should also be logged, and the log should be accessible to the user. By user, I mean one designated user for the entire server, not each individual user.
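The inspect, approve, send, and log sequence might look something like the Python sketch below. Everything named here is hypothetical: `approve` stands in for whatever screen shows the CSV to the designated user, `send` stands in for the email step, and `log` is a plain list in place of a real log file.

```python
import csv
import io
from datetime import datetime, timezone

FIELDS = ["report_id", "date", "displayed", "exported"]


def to_csv(rows):
    """Render the usage rows as plain CSV text the user can read in full."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(rows)
    return buf.getvalue()


def submit(rows, approve, send, log):
    """Show the CSV to the designated user, send only if approved, log either way."""
    text = to_csv(rows)
    approved = bool(approve(text))  # the user sees exactly what would be sent
    if approved:
        send(text)
    # Record the outcome so the user can later review every submission attempt.
    timestamp = datetime.now(timezone.utc).isoformat()
    log.append((timestamp, "sent" if approved else "declined", len(rows)))
    return approved
```

Because the text passed to `approve` is the same text passed to `send`, the user can be sure that what was inspected is what left the server.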
But I have not covered the most essential feature. Before the analytics are added, it is necessary to explain to customers, clearly and in plain language, the purpose of the collection and the need to work together to continue to improve the product. I would be very specific in detailing what will be collected, and perhaps arrange to add analytics to distinct areas of the product in stages. In careful language, I would also make clear that the ever-increasing demands for functionality really make such collection essential, but that the approach will be adjusted to best address customers’ concerns.
Despite the continuing concerns we all have about data harvesting, and the growing use of AI, I think that with a well-reasoned and well-presented approach, it is still possible to gain great benefits for developers and users.