Generating a set of data

In: AWS

15 Nov 2017

I just released a small project on GitHub that provides an easy way to generate sample data and push it to something like an AWS Kinesis stream. This is quite handy if you need, for example, to build a POC or a demo and require a data set.

Here is the link to the GitHub project: https://github.com/alfallouji/ALFAL-AWSBOOTCAMP-DATAGEN

Or to make it easier, you can click here to deploy it on AWS.

Please keep the following in mind: this code is provided free of charge, but if you decide to deploy it on AWS (using the CloudFormation script), you may incur charges for the AWS resources you use (e.g. EC2, S3, Kinesis, etc.).

The structure of the generated data can be defined within a configuration file.

The following features are supported:

  • Random integer (within a min-max range)
  • Random element from a list
  • Random element from a weighted list (e.g. ‘elem1’ => 20% chance, ‘elem2’ => 40% chance, etc.)
  • Constant
  • Timestamp / Date
  • Counter (increment & decrement)
  • Mathematical expression using previously defined fields (e.g. ({field1} + {field2} / 4) * {field3})
  • Conditional rules (e.g. {field3} equals TRUE if {field1} + {field2} < 1000; {field4} equals FALSE if {field1} + {field2} >= 1000)
  • Any of the features exposed by the fzaninotto/faker library
  • Ability to define the overall distribution (e.g. I want 20% of my population to have a value of ‘Y’ for {field3}). The generator will run until it meets the desired distribution.
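
To make the weighted list and the distribution features more concrete, here is a minimal PHP sketch of one way they could work. It is not taken from the project's code: the function names pickWeighted and generateWithDistribution, the $size parameter, and the accept-or-discard strategy are all assumptions made for illustration.

```php
<?php
// Hypothetical helper (not from the project): pick a key from a weighted
// list such as ['men' => 40, 'women' => 60]. Weights are positive integers.
function pickWeighted(array $weights): string
{
    $roll = mt_rand(1, array_sum($weights));
    foreach ($weights as $value => $weight) {
        $roll -= $weight;
        if ($roll <= 0) {
            return (string) $value;
        }
    }
    throw new LogicException('Weights must be positive integers.');
}

// Hypothetical helper: keep calling $generate until $size records match the
// requested ratio for $field, e.g. ['Y' => 3, 'N' => 7]. Records whose
// value has already reached its quota are discarded.
function generateWithDistribution(callable $generate, string $field, array $ratio, int $size): array
{
    $total = array_sum($ratio);
    $quota = [];
    foreach ($ratio as $value => $share) {
        $quota[$value] = intdiv($size * $share, $total);
    }

    $records = [];
    $remaining = array_sum($quota);
    while ($remaining > 0) {
        $record = $generate();
        $value = $record[$field];
        if (($quota[$value] ?? 0) > 0) {
            $quota[$value]--;
            $remaining--;
            $records[] = $record;
        }
    }
    return $records;
}

// Usage: 10 records, 3 of them with result = 'Y' and 7 with result = 'N'.
$records = generateWithDistribution(
    fn () => ['result' => pickWeighted(['Y' => 50, 'N' => 50])],
    'result',
    ['Y' => 3, 'N' => 7],
    10
);
```

Note that in this sketch the loop only ends once every quota is filled, so a value that the generator can never produce would keep it running forever.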

Here is an example of a configuration file:

// Define the desired distribution (optional)
'distribution' => array(
    // We want to have 30% of our distribution with a value of 'Y'
    // for the result field and 70% with a value of 'N'
    'result' => array(
        'Y' => 3,
        'N' => 7,
    ),
),

// Define the desired fields (mandatory)
'fields' => array(

    // You can use date function and provide the desired format
    'time' => array(
        'type' => 'date',
        'format' => 'Y-m-d H:i:s',
    ),

    // Randomly pick an integer number between 10 and 100
    'field1' => array(
        'type' => 'randomNumber',
        'randomNumber' => array(
            'max' => '100',
            'min' => '10',
        ),
    ),

    // Field2 is a constant equal to 1000 (could be any string)
    'field2' => array(
        'type' => 'constant',
        'constant' => '1000',
    ),

    // Randomly pick an element from a defined list of values
    'field3' => array(
        'type' => 'randomList',
        'randomList' => array(
            'us',
            'europe',
            'asia',
        ),
    ),

    // Pick an element from a weighted list
    'field4' => array(
        'type' => 'weightedList',
        'weightedList' => array(
            'men' => 40,
            'women' => 60,
        ),
    ),

    // You can use a mathematical expression that references other fields
    'field5' => array(
        'type' => 'mathExpression',
        'mathExpression' => '{field1} + {field2} + sin({field2}) * 10',
    ),

    // You can use any of the faker features
    'field6' => array(
        'type' => 'faker',
        'property' => 'name',
    ),

    'field7' => array(
        'type' => 'faker',
        'property' => 'email',
    ),

    'field8' => array(
        'type' => 'faker',
        'property' => 'ipv4',
    ),

    // You can define conditional rules that are evaluated in order to get the value.
    // If this condition is true:
    // {field1} + {field2} > 1060, then the value for {result} is 'Y'
    'result' => array(
        'type' => 'rules',
        // Value => condition
        'rules' => array(
            'Y' => '{field1} + {field2} > 1060',
            'N' => '{field1} + {field2} <= 1060',
        ),
    ),
),
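
The 'rules' field above can be read as: substitute each {field} placeholder with the value already generated for that field, then return the first value whose condition holds. The sketch below illustrates that reading in PHP. It is an assumption about the behaviour, not the project's actual implementation: evaluateExpression and resolveRules are hypothetical names, and this version only accepts numbers, arithmetic operators and comparison operators (so a function call such as sin() would be rejected).

```php
<?php
// Hypothetical helper (not from the project): replace {field} placeholders
// with numeric values from $record, then evaluate the resulting expression.
function evaluateExpression(string $expression, array $record)
{
    $pairs = [];
    foreach ($record as $name => $value) {
        if (is_numeric($value)) {
            $pairs['{' . $name . '}'] = (string) $value;
        }
    }
    $resolved = strtr($expression, $pairs);

    // Whitelist: digits, whitespace, arithmetic and comparison operators only.
    // Anything else (letters, unresolved placeholders) is refused before eval.
    if (!preg_match('/^[\d\s+\-*\/().<>=!]+$/', $resolved)) {
        throw new InvalidArgumentException("Unsupported expression: $resolved");
    }
    return eval('return ' . $resolved . ';');
}

// Hypothetical helper: return the first value whose condition is true,
// or null when no rule matches.
function resolveRules(array $rules, array $record): ?string
{
    foreach ($rules as $value => $condition) {
        if (evaluateExpression($condition, $record) === true) {
            return (string) $value;
        }
    }
    return null;
}

// Usage with the rules from the configuration example.
$rules = [
    'Y' => '{field1} + {field2} > 1060',
    'N' => '{field1} + {field2} <= 1060',
];
$result = resolveRules($rules, ['field1' => 70, 'field2' => '1000']); // 'Y'
```

Using eval behind a strict whitelist keeps the sketch short; a dedicated expression parser would be a safer choice for anything beyond a demo.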
Who am I?

My name is Bashar Al-Fallouji, I work as an Enterprise Solutions Architect at Amazon Web Services.

I am particularly interested in Cloud Computing, Web applications, Open Source Development, Software Engineering, Information Architecture, Unit Testing, XP/Agile development.

On this blog, you will find mostly technical articles and thoughts around PHP, OOP, OOD, Unit Testing, etc. I am also sharing a few open source tools and scripts.
