bigdata: Bigdata

Tuesday, January 13, 2015

Bigdata

Big data is about flexible, reliable, affordable web-scale computing. It is an all-encompassing term for any collection of data sets so large or complex that it becomes difficult to process using traditional data processing applications.

The challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and privacy violations. The trend toward larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data. A single large set allows correlations to be found that help spot business trends, prevent diseases, and combat crime.

Tools typically used in big data scenarios:

NoSQL Databases: MongoDB, CouchDB, Cassandra, Redis, BigTable, HBase, Hypertable, Voldemort, Riak, ZooKeeper

MapReduce: Hadoop, Hive, Pig, Cascading, Cascalog, mrjob, Caffeine, S4, MapR, Acunu, Flume, Kafka, Azkaban, Oozie, Greenplum

Storage: S3, Hadoop Distributed File System

Servers: EC2, Google App Engine, Elastic Beanstalk, Heroku

Processing: Yahoo! Pipes, Mechanical Turk, Solr/Lucene, ElasticSearch, Datameer, BigSheets, TinkerPop

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
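To give a feel for what a "simple programming model" can look like, here is a minimal word-count sketch in the MapReduce style, written in plain Python. It does not use Hadoop or any Hadoop API, and it runs in a single process. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this illustration; in a real cluster the framework would handle the grouping step and spread the work across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key (a stand-in for what a framework does
    between the map and reduce steps)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for a large data set.
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
```

The point of the split is that map_phase and reduce_phase only ever look at one line or one key at a time, which is what makes this style of program easy to distribute across many computers.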
