15 Nov 2017

I just released a small project on GitHub that provides an easy way to generate sample data and push it to something like an AWS Kinesis stream. This is quite handy if, for example, you need to build a POC or a demo and require a data set.

Here is the link to the GitHub project:

Or to make it easier, you can click here to deploy it on AWS.

Please keep the following in mind. This code is provided free of charge. If you decide to deploy it on AWS (using the CloudFormation script), you may incur charges for the AWS resources you use (e.g. EC2, S3, Kinesis, etc.).

The structure of the generated data can be defined within a configuration file.

The following features are supported:

  • Random integer (within a min-max range)
  • Random element from a list
  • Random element from a weighted list (e.g. ‘elem1’ => 20% chance, ‘elem2’ => 40% chance, etc.)
  • Constant
  • Timestamp / Date
  • Counter (increment & decrement)
  • Mathematical expression using previously defined fields, e.g. ({field1} + {field2} / 4) * {field3}
  • Conditional rules, e.g. {field3} equals TRUE if {field1} + {field2} < 1000; {field4} equals FALSE if {field1} + {field2} >= 1000
  • Any of the features exposed by the fzaninotto/faker library
  • Ability to define the overall distribution (e.g. I want 20% of my population to have a value of ‘Y’ for {field3}). The generator will run until it meets the desired distribution.
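
To make the weighted list feature more concrete, here is a minimal sketch of how such a pick could work. It is not taken from the project: the `pickWeighted` function is a hypothetical helper, and it assumes the weights are positive integers.

```php
<?php
// Hypothetical helper (not the project's code): pick one key from a
// weighted list such as array('men' => 40, 'women' => 60).
// Assumes every weight is a positive integer.
function pickWeighted(array $weights): string
{
    $total = array_sum($weights);
    if ($total <= 0) {
        throw new InvalidArgumentException('Weights must sum to a positive number.');
    }

    // Draw an integer in [1, total], then walk the cumulative weights
    // until the draw falls inside the current element's share.
    $draw = mt_rand(1, $total);
    foreach ($weights as $value => $weight) {
        $draw -= $weight;
        if ($draw <= 0) {
            return (string) $value;
        }
    }

    // Only reachable if the weights are not positive integers.
    throw new LogicException('Could not pick a value from the weighted list.');
}

// Usage: over many draws the counts should approach a 40/60 split.
$counts = array('men' => 0, 'women' => 0);
for ($i = 0; $i < 10000; $i++) {
    $counts[pickWeighted(array('men' => 40, 'women' => 60))]++;
}
print_r($counts);
```

The weights do not need to add up to 100; only their ratio matters in this sketch.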

Here is an example of a configuration file:

// Define the desired distribution (optional)
'distribution' => array(
    // We want 30% of the records to have a value of 'Y'
    // for the result field and 70% a value of 'N'
    'result' => array(
        'Y' => 3,
        'N' => 7,
    ),
),

// Define the desired fields (mandatory)
'fields' => array(

    // You can use the date function and provide the desired format
    'time' => array(
        'type' => 'date',
        'format' => 'Y-m-d H:i:s',
    ),

    // Randomly pick an integer between 10 and 100
    'field1' => array(
        'type' => 'randomNumber',
        'randomNumber' => array(
            'max' => '100',
            'min' => '10',
        ),
    ),

    // field2 is a constant equal to 1000 (could be any string)
    'field2' => array(
        'type' => 'constant',
        'constant' => '1000',
    ),

    // Randomly pick an element from a defined list of values
    'field3' => array(
        'type' => 'randomList',
        'randomList' => array(
        ),
    ),

    // Pick an element from a weighted list
    'field4' => array(
        'type' => 'weightedList',
        'weightedList' => array(
            'men' => 40,
            'women' => 60,
        ),
    ),

    // You can use a mathematical expression
    'field5' => array(
        'type' => 'mathExpression',
        // Expression built from previously defined fields
        'mathExpression' => '{field1} + {field2} + sin({field2}) * 10',
    ),

    // You can use any of the Faker features
    'field6' => array(
        'type' => 'faker',
        'property' => 'name',
    ),

    'field7' => array(
        'type' => 'faker',
        'property' => 'email',
    ),

    'field8' => array(
        'type' => 'faker',
        'property' => 'ipv4',
    ),

    // You can define conditional rules that are evaluated to get the value.
    // If {field1} + {field2} > 1060 is true, the value of {result} is 'Y';
    // otherwise it is 'N'.
    'result' => array(
        'type' => 'rules',
        // Value => condition
        'rules' => array(
            'Y' => '{field1} + {field2} > 1060',
            'N' => '{field1} + {field2} <= 1060',
        ),
    ),
),
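
To illustrate what "run until it meets the desired distribution" could mean for this configuration, here is a rough sketch. It is an assumption about the approach, not the project's actual implementation: the `generateWithDistribution` function and its quota logic are hypothetical, and the field values are hard-coded to mirror `field1`, `field2` and `result` above.

```php
<?php
// Hypothetical sketch (not the project's code): keep generating records
// until each value of 'result' has filled its share of the sample.
function generateWithDistribution(int $size, array $ratios): array
{
    // Turn ratios such as array('Y' => 3, 'N' => 7) into per-value quotas.
    // Integer division is used, so $size should be a multiple of the ratio sum.
    $total = array_sum($ratios);
    $quota = array();
    foreach ($ratios as $value => $ratio) {
        $quota[$value] = intdiv($size * $ratio, $total);
    }

    $records = array();
    while (array_sum($quota) > 0) {
        $field1 = mt_rand(10, 100);                        // 'randomNumber' between 10 and 100
        $field2 = 1000;                                    // 'constant'
        $result = ($field1 + $field2 > 1060) ? 'Y' : 'N';  // 'rules'

        // Keep the record only if its value is still needed; otherwise discard it.
        if (($quota[$result] ?? 0) > 0) {
            $quota[$result]--;
            $records[] = array(
                'field1' => $field1,
                'field2' => $field2,
                'result' => $result,
            );
        }
    }

    return $records;
}

// Usage: 10 records, of which exactly 3 have 'Y' and 7 have 'N'.
$records = generateWithDistribution(10, array('Y' => 3, 'N' => 7));
print_r(array_count_values(array_column($records, 'result')));
```

A loop like this only terminates if every requested value can actually be produced by the rules, which is the case here because `field1` can fall on either side of 60.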
