Flat files especially .csv files are very common for data loading and data
transfers. We generally come across scenarios where we want to read a .csv
file, which may or may not contain the header row. Also, there is a chance
.csv files don't have ','(comma) as delimiter rather has ''|"(pipe) as a
delimiter or any other character as a delimiter. This article will discuss
various ways to read such files in Mule 4 and create a JSON output of
it.
Mule Flow
This will be a simple flow, where we will be reading a file using a file
connector. Then convert the read data to JSON using transform message.
Finally, log the JSON payload to the console using Logger. Here, we are not
doing any complex transformation, it will be more like a MIME type conversion
from application/csv to application/json, for better understanding the
structure of input data.
Steps:
- Create a flow in your mule application.
-
Drag and drop file connector with operation "On New or Updated file" on the
source of mule flow.
-
In file connector properties under the general tab, specify or create a new
connector configuration. Provide input directory, and polling frequency(here
it is kept as 10 seconds). Specify the post-processing action like deleting
the file, moving the file, or renaming the file. As per the below
configurations, the mule application will look for a file in the inbound
folder every 10 seconds. If the file is available, then it will process and
move the file to the archive directory, deleting the file from the inbound
location.
-
Drag and drop "Transform Message" from mule palette, and have output mime
type as "application/json" as below.
-
Drag and drop logger from mule palette. Here we will be logging payload
after transformation.
- The final mule flow will look as below.
.csv file having header row in the file content
Let us take below employeeData.csv file as an example. For simplicity, we are
taking dummy data of only 5 employees.
id,first name,last name,email,gender,designation
1,Dom,Lissett,dlissett0@test.com,Male,Recruiter
2,Susan,Moogan,smoogan1@test.com,Female,Executive Secretary
3,Carl,Cotesford,lcotesford2@test.com,Male,Tech Architect
4,Debor,Sprakes,dsprakes3@test.com,Female,Account Coordinator
5,Priscella,Genny,pgenny5@test.com,Female,Environmental Tech
In this example, data in the first row signifies the headers separated by ","
as a delimiter. Rest all other rows represent employee data.
-
Since the above .csv file contains a header row, all we need to do is to
set the MIME type of file connector as "application/csv". To do so, click
on the file connector. Click on MIME Type and select "application/csv"
from the drop-down list of values.
-
Place the file as defined in the file connector. Run the application. We
will see the JSON response logged in the console as shown below.
INFO 2021-07-27 23:08:55,916 [[MuleRuntime].uber.03:
[csv-read].csv-readFlow.CPU_INTENSIVE @1e42a374] [processor:
csv-readFlow/processors/1; event: 895d1200-efb1-11eb-8134-d4d25289cb8d]
org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
{
"id": "1",
"first name": "Dom",
"last name": "Lissett",
"email": "dlissett0@test.com",
"gender": "Male",
"designation": "Recruiter"
},
{
"id": "2",
"first name": "Susan",
"last name": "Moogan",
"email": "smoogan1@test.com",
"gender": "Female",
"designation": "Executive Secretary"
},
{
"id": "3",
"first name": "Carl",
"last name": "Cotesford",
"email": "lcotesford2@test.com",
"gender": "Male",
"designation": "Tech Architect"
},
{
"id": "4",
"first name": "Debor",
"last name": "Sprakes",
"email": "dsprakes3@test.com",
"gender": "Female",
"designation": "Account Coordinator"
},
{
"id": "5",
"first name": "Priscella",
"last name": "Genny",
"email": "pgenny5@test.com",
"gender": "Female",
"designation": "Environmental Tech"
}
]
Data in the first row became the keys for the JSON output.
.csv file without header row in the file content
Let us assume there is no header row in the content of employeeData.csv
1,Dom,Lissett,dlissett0@test.com,Male,Recruiter
2,Susan,Moogan,smoogan1@test.com,Female,Executive Secretary
3,Carl,Cotesford,lcotesford2@test.com,Male,Tech Architect
4,Debor,Sprakes,dsprakes3@test.com,Female,Account Coordinator
5,Priscella,Genny,pgenny5@test.com,Female,Environmental Tech
If we try to run with the same configurations, it will take the first
row as keys and console output will look like.
INFO 2021-07-27 23:21:15,609 [[MuleRuntime].uber.03:
[csv-read].csv-readFlow.CPU_INTENSIVE @1e42a374] [processor:
csv-readFlow/processors/1; event: 4254cbd0-efb3-11eb-8134-d4d25289cb8d]
org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
{
"1": "2",
"Dom": "Susan",
"Lissett": "Moogan",
"dlissett0@test.com": "smoogan1@test.com",
"Male": "Female",
"Recruiter": "Executive Secretary"
},
{
"1": "3",
"Dom": "Carl",
"Lissett": "Cotesford",
"dlissett0@test.com": "lcotesford2@test.com",
"Male": "Male",
"Recruiter": "Tech Architect"
},
{
"1": "4",
"Dom": "Debor",
"Lissett": "Sprakes",
"dlissett0@test.com": "dsprakes3@test.com",
"Male": "Female",
"Recruiter": "Account Coordinator"
},
{
"1": "5",
"Dom": "Priscella",
"Lissett": "Genny",
"dlissett0@test.com": "pgenny5@test.com",
"Male": "Female",
"Recruiter": "Environmental Tech"
}
]
By default, after setting MIME type to "application/json", mule
considers the first row as the header row. In such cases, we don't want to
consider the first row as the header row.
-
Click on the file connector. Click on MIME Type. Click on add parameter.
Set key as "header" and value as "false". Save the changes.
Now, by using the same file, we will be able to read data properly. But in
this case, autogenerated values for keys like coloumn_0, column_1, etc.
will be populated. Below is a sample log snippet.
INFO 2021-07-27 23:35:57,158 [[MuleRuntime].uber.05:
[csv-read].csv-readFlow.CPU_INTENSIVE @6fbaa2fe] [processor:
csv-readFlow/processors/1; event: 4fbf0590-efb5-11eb-8134-d4d25289cb8d]
org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
{
"column_0": "1",
"column_1": "Dom",
"column_2": "Lissett",
"column_3": "dlissett0@test.com",
"column_4": "Male",
"column_5": "Recruiter"
},
{
"column_0": "2",
"column_1": "Susan",
"column_2": "Moogan",
"column_3": "smoogan1@test.com",
"column_4": "Female",
"column_5": "Executive Secretary"
},
{
"column_0": "3",
"column_1": "Carl",
"column_2": "Cotesford",
"column_3": "lcotesford2@test.com",
"column_4": "Male",
"column_5": "Tech Architect"
},
{
"column_0": "4",
"column_1": "Debor",
"column_2": "Sprakes",
"column_3": "dsprakes3@test.com",
"column_4": "Female",
"column_5": "Account Coordinator"
},
{
"column_0": "5",
"column_1": "Priscella",
"column_2": "Genny",
"column_3": "pgenny5@test.com",
"column_4": "Female",
"column_5": "Environmental Tech"
}
]
.csv file without header row and having a "|" as a delimiter
If we replace "," delimiter with "|" delimiter in employeeData.csv
1|Dom|Lissett|dlissett0@test.com|Male|Recruiter
2|Susan|Moogan|smoogan1@test.com|Female|Executive Secretary
3|Carl|Cotesford|lcotesford2@test.com|Male|Tech Architect
4|Debor|Sprakes|dsprakes3@test.com|Female|Account Coordinator
5|Priscella|Genny|pgenny5@test.com|Female|Environmental Tech
On running the same mule application with the above data, the output of the
above employeeData.csv file will be.
INFO 2021-07-27 23:42:06,938 [[MuleRuntime].uber.05:
[csv-read].csv-readFlow.CPU_INTENSIVE @6fbaa2fe] [processor:
csv-readFlow/processors/1; event: 2c2f0f20-efb6-11eb-8134-d4d25289cb8d]
org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
{
"column_0":
"1|Dom|Lissett|dlissett0@test.com|Male|Recruiter"
},
{
"column_0":
"2|Susan|Moogan|smoogan1@test.com|Female|Executive Secretary"
},
{
"column_0":
"3|Carl|Cotesford|lcotesford2@test.com|Male|Tech Architect"
},
{
"column_0":
"4|Debor|Sprakes|dsprakes3@test.com|Female|Account Coordinator"
},
{
"column_0":
"5|Priscella|Genny|pgenny5@test.com|Female|Environmental Tech"
}
]
Now, we are getting only one key-value pair for each row. By default mule
considers "," as delimiter for "application/csv" MIME type data. We need to
explicitly define "|" as a delimiter in parameters under the MIME type.
-
Click on the file connector. Click on MIME Type. Click on add parameter. Set
Key as "separator" and Value as "|". Save the changes.
Now, by using the same file, we will be able to read data properly as shown
below.
INFO 2021-07-27 23:51:57,557 [[MuleRuntime].uber.06:
[csv-read].csv-readFlow.CPU_INTENSIVE @6efd4175] [processor:
csv-readFlow/processors/1; event: 8c31a710-efb7-11eb-8134-d4d25289cb8d]
org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
{
"column_0": "1",
"column_1": "Dom",
"column_2": "Lissett",
"column_3": "dlissett0@test.com",
"column_4": "Male",
"column_5": "Recruiter"
},
{
"column_0": "2",
"column_1": "Susan",
"column_2": "Moogan",
"column_3": "smoogan1@test.com",
"column_4": "Female",
"column_5": "Executive Secretary"
},
{
"column_0": "3",
"column_1": "Carl",
"column_2": "Cotesford",
"column_3": "lcotesford2@test.com",
"column_4": "Male",
"column_5": "Tech Architect"
},
{
"column_0": "4",
"column_1": "Debor",
"column_2": "Sprakes",
"column_3": "dsprakes3@test.com",
"column_4": "Female",
"column_5": "Account Coordinator"
},
{
"column_0": "5",
"column_1": "Priscella",
"column_2": "Genny",
"column_3": "pgenny5@test.com",
"column_4": "Female",
"column_5": "Environmental Tech"
}
]
References:
Please share your valuable feedback 😊
This is a repost of the LinkedIn article....
Comments
Post a Comment