FortiSIEM
FortiSIEM provides Security Information and Event Management (SIEM) and User and Entity Behavior Analytics (UEBA)
Andy_G
Staff
Article Id 196230

Description

How does the AccelOps architecture process large volumes of data from many different sources, and how does it scale to support large customers?

There are three major reasons why the AccelOps solution is able to process large amounts of data and ensure that the application is responsive:

1. Hybrid Database Model

AccelOps stores the CMDB (structured data) in a relational database and stores events (unstructured time-series data) in a proprietary, heavily indexed, flat-file-based database. From our experience using Oracle for CS-MARS, we observed that relational databases do not scale to high-volume log processing (around 5K events per second, or EPS), which demands heavy writes (during event storage) and heavy reads (during queries) at the same time. Event rates up to 1K EPS are manageable, but anything higher stresses the relational database server's CPU. The indexing overhead of a relational database is also very high: in CS-MARS, the event table had 10 columns and 3 indices, and we were already straining the CPU at high event rates. Because AccelOps provides a unified view of performance, availability, and security, it has to support hundreds of event attributes and index all of them for efficient queries. The attributes could be split across many tables, but queries would then require joins, which are expensive. That is why we created a flat-file-based database. The structure of the flat-file store is carefully designed as follows:

  • Store data from different sites into different directories
  • In each directory, store one file per disjoint time interval
  • Index every attribute (currently 300+) and every keyword in the log file
  • Compress the log files
  • As time goes by, aggregate the old event files into bigger event files
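The layout described in the list above can be illustrated with a small sketch. This is not the AccelOps on-disk format, which is proprietary; the one-hour bucket size, the JSON-lines encoding, the file names, and the functions `write_events` and `query` are all assumptions made for illustration. It shows one directory per site, one compressed file per time interval, and a per-file inverted index so that a query can skip files that cannot match.

```python
import gzip
import json
import tempfile
from collections import defaultdict
from pathlib import Path

BUCKET_SECONDS = 3600  # assumption: one event file per hour


def bucket_start(ts):
    """Round a Unix timestamp down to the start of its time interval."""
    return ts - ts % BUCKET_SECONDS


def write_events(root, events):
    """Write events into <root>/<site>/<bucket>.log.gz plus an index file.

    Sketch only: calling this twice for the same bucket overwrites the file.
    """
    groups = defaultdict(list)
    for ev in events:
        groups[(ev["site"], bucket_start(ev["ts"]))].append(ev)
    for (site, start), evs in groups.items():
        site_dir = root / site
        site_dir.mkdir(parents=True, exist_ok=True)
        # Compressed event file, one JSON event per line.
        with gzip.open(site_dir / f"{start}.log.gz", "wt", encoding="utf-8") as f:
            for ev in evs:
                f.write(json.dumps(ev) + "\n")
        # Inverted index: "attribute=value" -> line numbers in this file.
        index = defaultdict(list)
        for lineno, ev in enumerate(evs):
            for attr, value in ev.items():
                index[f"{attr}={value}"].append(lineno)
        (site_dir / f"{start}.idx.json").write_text(json.dumps(index))


def query(root, site, t0, t1, attr, value):
    """Return events for `site` with t0 <= ts < t1 and attr == value."""
    hits = []
    for idx_path in sorted((root / site).glob("*.idx.json")):
        start = int(idx_path.name.split(".")[0])
        if start + BUCKET_SECONDS <= t0 or start >= t1:
            continue  # interval does not overlap the query window
        index = json.loads(idx_path.read_text())
        linenos = set(index.get(f"{attr}={value}", []))
        if not linenos:
            continue  # index says no match: never decompress this file
        log_path = idx_path.with_name(f"{start}.log.gz")
        with gzip.open(log_path, "rt", encoding="utf-8") as f:
            for lineno, line in enumerate(f):
                if lineno in linenos:
                    ev = json.loads(line)
                    if t0 <= ev["ts"] < t1:
                        hits.append(ev)
    return hits


# Usage with made-up events in a temporary directory.
root = Path(tempfile.mkdtemp())
write_events(root, [
    {"site": "siteA", "ts": 10, "src_ip": "10.0.0.1", "event_type": "login"},
    {"site": "siteA", "ts": 20, "src_ip": "10.0.0.2", "event_type": "login"},
    {"site": "siteA", "ts": 3700, "src_ip": "10.0.0.1", "event_type": "logout"},
    {"site": "siteB", "ts": 15, "src_ip": "10.0.0.1", "event_type": "login"},
])
matches = query(root, "siteA", 0, 7200, "src_ip", "10.0.0.1")
print([ev["ts"] for ev in matches])
```

Because each file covers a disjoint interval and carries its own index, a query touches only the site directory and the intervals it asks for, and writes never update a shared index structure.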

A data management layer unifies the relational database that stores the CMDB and the flat files that store events, and provides an API for querying the data.

2. Grid Scale-out Architecture

The flat-file database scales simply by adding servers. We have a tiered grid architecture in which computing nodes can be added as needed; because the data sits on NAS/SAN, any computing node can read it directly and share the work of a query, which speeds up query processing. This is not practical with a relational database: making it parallel means additional hardware cost, and performance does not scale linearly.
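The scale-out idea can be sketched as a partition-scan-merge pattern. This is an illustrative sketch, not AccelOps code: the `SHARED_STORAGE` dictionary is a hypothetical stand-in for event files on NAS/SAN, and threads stand in for separate computing nodes, so it shows the shape of the work split rather than real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for event files on shared NAS/SAN storage:
# file name -> events recorded in that time interval.
SHARED_STORAGE = {
    "0000.log": [{"event_type": "login"}, {"event_type": "logout"}],
    "3600.log": [{"event_type": "login"}, {"event_type": "login"}],
    "7200.log": [{"event_type": "deny"}],
    "10800.log": [{"event_type": "login"}, {"event_type": "deny"}],
}


def scan_files(file_names):
    """Work done by one computing node: count events by type in its files."""
    counts = Counter()
    for name in file_names:
        for ev in SHARED_STORAGE[name]:
            counts[ev["event_type"]] += 1
    return counts


def grid_query(num_nodes):
    """Split the file list across nodes, scan in parallel, merge the results."""
    names = sorted(SHARED_STORAGE)
    # Round-robin assignment of files to nodes.
    shards = [names[i::num_nodes] for i in range(num_nodes)]
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(scan_files, shards))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


result = grid_query(num_nodes=2)
print(result)
```

The merged answer is the same for any number of nodes, because every file is independent and all nodes can reach the same storage; adding nodes only changes how many files each one scans.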

For data collection, how does one monitor thousands of servers? Traditional vendors deploy agents, but we can do this in an agentless fashion. In our grid architecture, we deploy nodes and the workload is load-balanced among them. A single node can monitor all aspects of 300+ servers and handle 5,000 events per second (EPS).
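As a rough illustration of the sizing and balancing described above, the sketch below takes the per-node figures quoted in this article (300 servers and 5,000 EPS) as planning limits and assumes capacity adds up linearly across nodes. The functions `nodes_needed` and `balance` and the device names are hypothetical, and the greedy least-loaded placement is one common balancing heuristic, not necessarily the one the product uses.

```python
import heapq
import math

SERVERS_PER_NODE = 300  # per-node figure quoted in the article
EPS_PER_NODE = 5000     # per-node figure quoted in the article


def nodes_needed(servers, eps):
    """Estimate node count, assuming capacity scales linearly with nodes."""
    by_servers = math.ceil(servers / SERVERS_PER_NODE)
    by_eps = math.ceil(eps / EPS_PER_NODE)
    return max(by_servers, by_eps, 1)


def balance(device_eps, num_nodes):
    """Greedy placement: largest device first, onto the least-loaded node."""
    heap = [(0, node) for node in range(num_nodes)]  # (current EPS, node id)
    assignment = [[] for _ in range(num_nodes)]
    for device, eps in sorted(device_eps.items(), key=lambda kv: -kv[1]):
        load, node = heapq.heappop(heap)
        assignment[node].append(device)
        heapq.heappush(heap, (load + eps, node))
    return assignment


# Usage with made-up numbers.
print(nodes_needed(servers=1000, eps=5000))
print(balance({"fw1": 100, "web1": 80, "db1": 30, "dns1": 20}, num_nodes=2))
```

In the first call the server count, not the event rate, determines the estimate, which is why both limits are checked.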

3. Online data processing limited only by storage

In a traditional database-oriented solution, once the database limit is reached, old data has to be purged. If you later need to analyze the purged data, you need another system to load it back into a database. Our solution has no such requirement: as long as the storage space is available, the system can query the data immediately.

 

 
