Applicable Technologies: IBM Big Data

Big Data Technologies

IBM offers its own range of products for working with big data. In contrast to free software, these products are designed for corporate use and offer broad integration capabilities, robust security features, and more convenient administration.

Together, these features bring the total cost of ownership in line with that of solutions based on free software. The right choice depends on the individual project and client.

IBM InfoSphere BigInsights is a Hadoop distribution from a leading commercial software vendor.

The distinctive features of the distribution are:

  • A full SQL implementation for data access
  • BigSheets, a spreadsheet-style tool similar to Excel that makes it quick and easy to work with heterogeneous data
  • Eclipse-based development tools with various add-ons that reduce development cost and time
  • A single console to control all the services and security features
  • Proprietary workload optimization mechanisms that help achieve better performance than the free version
  • The option of using the GPFS file system, an alternative to HDFS that is widely used in corporate systems

IBM InfoSphere Streams is an analytical platform that processes incoming data on the fly: up to millions of events per second from thousands of different sources.

A user-friendly interface lets you build a structured data processing chain that compares and correlates data, filters and formats it, triggers notifications, runs other tasks, and responds to other incoming signals in real time.
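The idea of a processing chain can be sketched in a few lines. The example below is a minimal illustration in plain Python generators and does not use the InfoSphere Streams API; the event fields (source, device, value), the function names, and the threshold are all hypothetical. It filters out unusable events, formats the rest, and raises a notification once two different sources report a high value for the same device.

```python
from collections import defaultdict

# Hypothetical event shape: {"source": str, "device": str, "value": float or None}

def filter_valid(events):
    """Filter step: drop events that carry no numeric value."""
    for event in events:
        if isinstance(event.get("value"), (int, float)):
            yield event

def format_event(events):
    """Format step: lower-case the device name and round the value."""
    for event in events:
        yield {
            "source": event["source"],
            "device": event["device"].lower(),
            "value": round(event["value"], 1),
        }

def correlate(events, threshold):
    """Correlation step: emit one notification per device as soon as
    two different sources have reported a value above the threshold."""
    sources_by_device = defaultdict(set)
    for event in events:
        if event["value"] > threshold:
            seen = sources_by_device[event["device"]]
            seen.add(event["source"])
            if len(seen) == 2:
                yield {"device": event["device"], "sources": sorted(seen)}

def run_pipeline(events, threshold=90.0):
    """Chain the steps; each one consumes the previous step lazily."""
    return list(correlate(format_event(filter_valid(events)), threshold))

events = [
    {"source": "probe-a", "device": "Router1", "value": 95.26},
    {"source": "probe-a", "device": "Router2", "value": None},
    {"source": "probe-b", "device": "router1", "value": 97.0},
    {"source": "probe-b", "device": "router2", "value": 40.0},
]
alerts = run_pipeline(events)
```

Because every step is a generator, events flow through the chain one at a time, which is the property that matters for on-the-fly processing; a real streaming platform adds distribution, fault tolerance, and throughput that this sketch does not attempt.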

IBM Watson Explorer (WE) is a product for centralized data collection and processing; it is an advanced search engine for corporate needs.

WE can collect data from a wide range of sources, such as websites, document archives (including PDF), databases, and web services. Data processing rules can be set for all sources, which makes it possible to link data across sources and see the big picture.

Typical system operation scenarios:

  • Search. Keyword search runs across all data collected by the system. The system offers the advanced search features that users are accustomed to from Yandex or Google.
  • A single view of a given topic. The system creates a result page with a specific set of data, for example, a client page for a telecom service provider. When you select a client from the search results, the page displays data from multiple sources, such as CRM, billing, network monitoring, the client’s posts on social networks, etc.
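The single-view scenario can be illustrated with a small sketch. The code below is plain Python and is not the Watson Explorer API; the source names (crm, billing, network_monitoring), the client_id field, and the sample records are hypothetical. It gathers every record that belongs to one client from each source into a single structure that a result page could display.

```python
def build_client_view(client_id, sources):
    """Collect the records for one client from every named source.

    `sources` maps a source name to a list of record dictionaries,
    each of which is assumed to carry a "client_id" key.
    """
    view = {"client_id": client_id}
    for name, records in sources.items():
        view[name] = [r for r in records if r.get("client_id") == client_id]
    return view

# Hypothetical sample data standing in for the connected systems.
sources = {
    "crm": [{"client_id": 7, "name": "A. Ivanov"}],
    "billing": [
        {"client_id": 7, "balance": 120.5},
        {"client_id": 8, "balance": 0.0},
    ],
    "network_monitoring": [{"client_id": 8, "status": "ok"}],
}
view = build_client_view(7, sources)
```

A source with no matching records still appears in the view with an empty list, so the page layout stays the same for every client.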

Integrated Hardware and Software Systems

We offer IBM integrated hardware and software solutions. Each solution is typically a preconfigured set of server hardware and software optimized for specific tasks.

Key advantages of this approach:

  • If you need to buy equipment for big data solutions, purchasing an integrated hardware and software system is usually cheaper than buying the components separately
  • Because all components are specially configured to work together, the solution delivers higher performance

Available solutions:

  • PureData System for Transactions is a solution for processing a large number of transaction requests
  • PureData System for Analytics uses Netezza technology and substantially accelerates the execution of analytical queries
  • PureData System for Operational Analytics implements the capabilities of InfoSphere Streams for streaming data processing
  • PureData System for Hadoop is a pre-integrated platform based on InfoSphere BigInsights

Information Management

No less important is the ability to integrate with reliable IBM technologies, such as the DB2 DBMS and Master Data Management (MDM) solutions.
