R Language

R is an open-source procedural language optimized for data science and statistics work. R also provides a data visualization framework.

Advantages

  • Suitable for data science and statistics work
  • Code once, deploy anywhere
  • Open source, with a big community; universities teach the R language

R Limitations

  • Holds all data in memory
  • Memory management is not efficient
  • Dynamically typed language
  • Interpreted
  • Garbage collection is poor
  • Lacks parallel computation by default (base R code runs single-threaded)
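
Two of the points above, dynamic typing and in-memory data, can be seen in a few lines of base R. This is only a minimal sketch: the variable names x and v are made up for illustration, and the exact memory figure printed depends on the R build.

```r
# Dynamic typing: the same name can hold values of different types.
x <- 42L
print(class(x))   # "integer"
x <- "forty-two"
print(class(x))   # "character"

# In-memory data: a vector of one million doubles lives entirely in RAM
# (about 8 bytes per element, plus a small header).
v <- numeric(1e6)
print(object.size(v), units = "MB")

# Dropping the object and calling gc() asks R to reclaim that memory.
rm(v)
invisible(gc())
```

Because every object sits in RAM like v does here, a data set larger than the available memory cannot be loaded this way on a single machine.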
    -------------

    What are the deployment options for a Big Data solution?

  • Local deployment
  • On-premise cluster
  • Cloud cluster

    Local Deployment

    It is great for initial analysis where one person is doing the work. A personal computer with an external hard disk is enough, and there is no additional cost.

    On-Premise Cluster

    To operationalize a big data solution for an enterprise, you need a cluster. A cluster is costly and not easy to scale: increasing capacity can take many days. The cost of managing it is also very high.

    Cloud Cluster

    A cloud cluster can be set up in hours, and there is no initial setup cost, but you pay a monthly fee. It is easy to scale if you are using a PaaS solution.
