Another script now. I want to parse a huge txt file and create many small markdown files out of it.
Here is an example of the txt file:
Date: 7 October 2011 at 08:00:00 EEST
Weather: 17°C Overcast
Location: Gothenburg, Sweden
![](photos/cd9fe903e2ef06dbfea4516b3665f395.jpeg)Talked with my friend for over two hours.
Date: 8 October 2011 at 13:48:00 EEST
Location: Tsehel'skoho vulytsya, Lviv, Oblast Lemberg, Ukraine
![](photos/f4724282e20b5946663fed08358c1700.jpeg)That’s me and my friend, we bought some books.
Each post starts with a Date: line; there is always a tab and then Date: at the beginning of each post. Sometimes there’s other info, e.g. location or weather. A new Date: line means another post begins.
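A minimal sketch of that splitting rule, assuming Python (no script language is named here). The names `split_posts`, `DATE_LINE`, and `META_LINE` are made up for illustration, and only the Weather and Location keys from the example are recognized.

```python
import re

# A post begins at a line whose first non-blank text is "Date:".
# Leading whitespace is optional here, so a missing tab is tolerated.
DATE_LINE = re.compile(r"^\s*Date:\s*(.+)$")
# Assumption: only these two extra keys occur, as in the example.
META_LINE = re.compile(r"^\s*(Weather|Location):\s*(.+)$")


def split_posts(text):
    """Return a list of posts, each a dict with 'meta' and 'body' keys."""
    posts = []
    current = None
    for line in text.splitlines():
        date_match = DATE_LINE.match(line)
        if date_match:
            # Every Date: line opens a new post.
            current = {"meta": {"date": date_match.group(1).strip()}, "body": []}
            posts.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first Date: line
        if not current["body"]:
            if not line.strip():
                continue  # skip blank lines between metadata and body
            meta_match = META_LINE.match(line)
            if meta_match:
                key = meta_match.group(1).lower()
                current["meta"][key] = meta_match.group(2).strip()
                continue
        current["body"].append(line)
    return posts
```

Metadata is only recognized before the body starts, so a line such as "Location: ..." in the middle of an entry stays part of the text.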
I want all that metadata to become the markdown file’s front matter. Front matter starts with +++ and ends with +++ followed by an empty line. The front matter in the new markdown file is formatted with TOML syntax.
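A rough sketch of building that front matter, assuming Python. The `TZ_OFFSETS` table is an assumption (only the EEST and EET abbreviations are mapped, since `strptime` does not resolve such names reliably), and `parse_date`, `toml_string`, and `front_matter` are hypothetical helper names. Month names are parsed with `%B`, which assumes an English or C locale.

```python
from datetime import datetime

# Assumption: the only zone abbreviations that appear, with fixed offsets.
TZ_OFFSETS = {"EEST": "+0300", "EET": "+0200"}


def parse_date(raw):
    """Split '7 October 2011 at 08:00:00 EEST' into a datetime and an offset."""
    stamp, _, zone = raw.rpartition(" ")
    parsed = datetime.strptime(stamp, "%d %B %Y at %H:%M:%S")
    return parsed, TZ_OFFSETS[zone]


def toml_string(value):
    """Quote a value as a TOML basic string, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def front_matter(meta):
    """Render the +++ delimited block, followed by one empty line."""
    parsed, offset = parse_date(meta["date"])
    lines = [
        "+++",
        "title = 'Day One Entry'",
        'date = "' + parsed.strftime("%Y-%m-%dT%H:%M:%S") + offset + '"',
    ]
    # Optional keys are emitted only when the post has them.
    for key in ("weather", "location"):
        if key in meta:
            lines.append(key + " = " + toml_string(meta[key]))
    lines += ["draft = false", "tags = ['diary', 'DayOne']", "+++", ""]
    return "\n".join(lines) + "\n"
```

An unknown zone abbreviation raises `KeyError` here, which seems safer than silently writing a wrong offset.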
If there’s — anywhere in the text, I want you to insert an empty line before and after it.
If there’s a markdown-syntax image that follows the pattern ![](), insert an empty line before and after it.
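The image rule could look roughly like this in Python. `isolate_images` is a hypothetical name, and collapsing extra newlines and trimming spaces next to the image are assumptions about the desired result rather than something stated in the request.

```python
import re

# Matches ![alt](target), plus any spaces or tabs hugging it on the same line.
IMAGE = re.compile(r"[ \t]*(!\[[^\]]*\]\([^)]*\))[ \t]*")


def isolate_images(body):
    """Put every markdown image on its own line with an empty line around it."""
    spaced = IMAGE.sub(r"\n\n\1\n\n", body)
    # Collapse runs of three or more newlines into a single empty line.
    spaced = re.sub(r"\n{3,}", "\n\n", spaced)
    # No empty lines at the very start or end; finish with one newline.
    return spaced.strip("\n") + "\n"
```

Because the front matter already ends with an empty line, leading empty lines are stripped from the body so the two do not stack.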
The result for the text above is two files. One is $BASEDIR/2011/10-07-Fri/0800.md, which is $BASEDIR/$DATE_DIR/%HH%MM.md, where DATE_DIR=$(date +"%Y/%m-%d-%a"). The other file is .notes/2011/10-08-Sat/1348.md.
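A sketch of the path rule in Python, assuming the base directory is `.notes` as the second path suggests. `BASEDIR` and `output_path` are illustrative names, and `%a` yields English day names such as Fri only under an English or C locale.

```python
from datetime import datetime
from pathlib import Path

# Assumption: the base directory is .notes, as in the second example path.
BASEDIR = Path(".notes")


def output_path(moment, basedir=BASEDIR):
    """Map a post's datetime to basedir/YYYY/MM-DD-Day/HHMM.md."""
    date_dir = moment.strftime("%Y/%m-%d-%a")  # same format as the date command
    file_name = moment.strftime("%H%M") + ".md"
    return basedir / date_dir / file_name
```

For the first example post, `output_path(datetime(2011, 10, 7, 8, 0))` gives `.notes/2011/10-07-Fri/0800.md`.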
2011/10-07-Fri/0800.md content is:
+++
title = 'Day One Entry'
date = "2011-10-07T08:00:00+0300"
weather = "17°C Overcast"
location = "Gothenburg, Sweden"
draft = false
tags = ['diary', 'DayOne']
+++

![](photos/cd9fe903e2ef06dbfea4516b3665f395.jpeg)

Talked with my friend for over two hours.
2011/10-08-Sat/1348.md content is:
+++
title = 'Day One Entry'
date = "2011-10-08T13:48:00+0300"
location = "Tsehel’skoho vulytsya, Lviv, Oblast Lemberg, Ukraine"
draft = false
tags = ['diary', 'DayOne']
+++

![](photos/f4724282e20b5946663fed08358c1700.jpeg)

That’s me and my friend, we bought some books.
If the directory doesn’t exist, obviously you need to create it with mkdir -p. If there’s a ' symbol, convert it to either ’ or ‘ according to typographic norms: if it’s inside a word or at the end of it, it becomes ’; if it’s at the beginning of a word, it becomes ‘. The input file is provided as an argument to the script, and all other locations, like $BASEDIR, are specified in the script.
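A possible Python sketch of the apostrophe rule and of writing a file with its directory created on demand. `curl_apostrophes` and `write_entry` are hypothetical names; an apostrophe that touches no word character on either side is left unchanged, which is an assumption since that case isn’t covered above.

```python
import re
from pathlib import Path


def curl_apostrophes(text):
    """Replace straight apostrophes with typographic ones."""
    # Preceded by a word character: inside a word or at its end -> U+2019.
    text = re.sub(r"(?<=\w)'", "\u2019", text)
    # Any remaining one followed by a word character: word start -> U+2018.
    text = re.sub(r"'(?=\w)", "\u2018", text)
    return text


def write_entry(path, content):
    """Write content to path, creating parent directories like mkdir -p."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
```

The expected location value shows a curly apostrophe, so the conversion would apply to metadata values as well as to the body text.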