Monday, 1 February 2021

Configuring Elastic and importing data

Installing and Configuring Elastic

1) In the config folder, set elasticsearch.yml as follows:

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
#cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: D:\Work\echallan\elastic\data
#
# Path to log files:
#
path.logs: D:\Work\echallan\elastic\logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 10.25.97.185
#
# Set a custom port for HTTP:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
discovery.type: single-node
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
action.auto_create_index: ".watches,.triggered_watches,.watcher-history-*"


2) Start Elasticsearch from the bin folder (elasticsearch.bat on Windows).
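
Most of the elasticsearch.yml in step 1 is the stock commented template, so it can help to list only the lines that are actually active. The Python sketch below is one way to do that. The function name active_settings is an assumption made for this example, and it handles only flat "key: value" lines like the ones used here, not full YAML.

```python
def active_settings(text):
    """Return the uncommented `key: value` lines of a flat config file as a dict.

    Hypothetical helper: handles only simple one-line settings, not full YAML.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Sample lines copied from the configuration in step 1.
sample = r"""
#cluster.name: my-application
path.data: D:\Work\echallan\elastic\data
path.logs: D:\Work\echallan\elastic\logs
network.host: 10.25.97.185
#http.port: 9200
discovery.type: single-node
"""

for key, value in active_settings(sample).items():
    print(f"{key} = {value}")
```

Splitting on the first colon only keeps Windows paths such as D:\Work\... intact as the value.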


Importing CSV data into Elastic using Logstash

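1) Create the index (database) using curl -X PUT "Q87H3-AM:9200/demo-csv?pretty". The index has to be created by hand because the action.auto_create_index setting above only allows the Watcher indices to be created automatically.

2) Install Logstash on the C: drive.

3) In a file named <name>.conf, write: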

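input {
  file {
    path => "C:/logstash-7.10.2/bin/test.csv"
    start_position => "beginning"
    # In read mode the file is deleted once it has been fully read,
    # unless file_completed_action is set to something other than "delete".
    mode => "read"
  }
}

filter {
  csv {
    separator => ","
    # Drops any row that exactly matches the column names below.
    skip_header => true
    columns => ["id","timestamp","paymentType","name","gender","ip_address","purpose","country","age"]
  }
}

output {
  elasticsearch {
    hosts => "http://Q87H3-AM:9200"
    index => "demo-csv"
  }

  # Also print each event to the console for debugging.
  stdout { codec => rubydebug }
}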

4) In the environment variables, replace JVM_OPTS with _JVM_OPTIONS, keeping the same value.

5) Run Logstash with logstash -f <name>.conf. Adjust parameters and operating-system specifics as required.

6) To see the created index in Elastic, run curl "http://Q87H3-AM:9200/_cat/indices?v"

7) To see the data, run curl "http://Q87H3-AM:9200/demo-csv/_search?pretty"
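
When reading the _search output, each hit's _source should carry the nine fields named in the csv filter's columns list, alongside the fields Logstash adds itself. As a rough illustration of that mapping, and not of how Logstash implements it, the Python sketch below pairs each CSV row with those column names and drops a header row that matches them. The function name rows_to_documents and the sample row are made up for this example.

```python
import csv
import io

# Column names copied from the csv filter in the Logstash config.
COLUMNS = ["id", "timestamp", "paymentType", "name", "gender",
           "ip_address", "purpose", "country", "age"]


def rows_to_documents(csv_text, columns=COLUMNS):
    """Pair each CSV row with the column names, skipping a matching header row."""
    documents = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or row == columns:
            continue  # blank line, or a header row identical to the column list
        documents.append(dict(zip(columns, row)))
    return documents


# Made-up sample data, only to show the shape of the result.
sample = (
    "id,timestamp,paymentType,name,gender,ip_address,purpose,country,age\n"
    "1,2021-02-01T10:00:00Z,card,Jane Doe,F,192.0.2.10,fine,India,34\n"
)

for doc in rows_to_documents(sample):
    print(doc)
```

All values stay strings in this sketch. The config above sets no type conversion either, so a field such as age is sent to the index as text unless a conversion is added.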