A former client of mine, an enterprise architect and a guy I really respect, recommended Splunk ("not just a dirty word") to me. Splunk brings, more or less, a search-engine approach to log file analysis. Now, this is not the sexiest thing in the world, but it's critically important, especially in large IT shops. A large enterprise generates humongous amounts of log files; my friend said he's pinned a big server with just the logs from their domain controllers. And remember, these are just text files.
So the question becomes: how do you analyze all this? Traditionally, people have taken a static reporting approach, which has its place, but you need more when you have to be actively responsive. When was the last time David Hasselhoff logged on? Where was he? What systems did he log onto? Did he look at Michael Jackson's billing records?
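To make the idea concrete, here's a toy sketch of that brute-force, search-engine style of answering a question like "when did this user last log on?" This is not Splunk's actual engine or query language; the log format and usernames are invented for the example.

```python
def last_logon(log_lines, user):
    """Scan log lines oldest-to-newest and return the timestamp of the
    user's most recent logon event, or None if there isn't one.

    Assumes a made-up "timestamp host event user" line layout.
    """
    latest = None
    for line in log_lines:
        parts = line.split()
        if len(parts) == 4 and parts[2] == "LOGON" and parts[3] == user:
            latest = parts[0]  # keep overwriting; the last match wins
    return latest


# Hypothetical domain-controller log lines, purely for illustration.
logs = [
    "2009-03-01T08:15 dc01 LOGON dhasselhoff",
    "2009-03-01T09:02 dc02 LOGOFF dhasselhoff",
    "2009-03-02T07:58 dc01 LOGON dhasselhoff",
]

print(last_logon(logs, "dhasselhoff"))  # 2009-03-02T07:58
```

The point of the sketch: no schema, no data warehouse, no canned report. You just scan (or, at scale, index) the raw text and ask your question directly, which is exactly the Google-style appeal.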
Long ago, people thought that some kind of library-like structure was required in order to discover information on the Internet, but it turned out that brute-force searching was better. Likewise in this case, where the end goal is a Google-like interface. Now, this approach has its limitations. You have to know what you're looking for, first of all. It doesn't do correlations. It's got a beautifully simple interface, but it's not an easy UI for normal, proactive review. It's not for canned reports. It's not a SEM (Security Event Manager) or SIEM (Security Information and Event Management) tool.
But for what it is, it's great. It's easy to look at Splunk and say, "you're just indexing text," but there is great power in that; look at Google. There's been such a huge emphasis on auditability that we've generated huge files of events, but mostly they just sit there unloved. Splunk is a good way to leverage that resource.