#1 2020-08-25 17:41:26

brontosaurusrex
Middle Office
Registered: 2015-09-29
Posts: 2,102

data wrangling, replace some lines and bypass other

data.txt for testing (seconds are irrelevant)

some
12:45:00
other
14:45:00
data
is
here
00:09:00
23:02:00
and here as well

I would like to convert the time lines to H.MM format and pass every other line through unchanged. I came up with:

#!/bin/bash

while read -r line
do

    # line by line, if it looks like time, convert, else bypass
    if grep -qE '^[0-9]{2}:[0-9]{2}:[0-9]{2}$' <<< "$line" ; then

        echo -n "$line > " # debug
        echo "$line" | date "+%k.%M" -f -

    else # bypass

        echo "$line < bypass"

    fi

done < data.txt

A better way, a simpler way? A magic one-line awk/sed?

Last edited by brontosaurusrex (2020-08-25 17:42:17)

#2 2020-08-25 19:55:39

ohnonot
...again
Registered: 2015-09-29
Posts: 4,882

Possibly:

#!/bin/bash

while read -r line
do
    if [[ "$line" == [0-2][0-9]:[0-5][0-9]:[0-5][0-9] ]]; then
        line="${line%:*}"
        echo "${line/0/ }"
    else
        echo "$line"
    fi
done < data.txt

not tested.


#3 2020-08-25 20:17:25

brontosaurusrex
Middle Office
Registered: 2015-09-29
Posts: 2,102

^ Nice. (Except 23:02 becomes 23. 2).
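
A possible fix, as a sketch only: `${line/0/ }` replaces the first 0 found anywhere in the string, so 23:02 loses the zero in its minutes. Limiting the zero-stripping to the hour field avoids that. The function name to_hmm is made up for illustration, and printing the hour unpadded (0.09 rather than " 0.09") is an assumption about what H.MM should look like.

```shell
#!/bin/bash
# to_hmm (hypothetical name): read lines on stdin, rewrite HH:MM:SS lines
# as H.MM and print every other line unchanged.
to_hmm() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            [0-2][0-9]:[0-5][0-9]:[0-5][0-9])
                hm=${line%:*}    # drop the seconds:  "23:02:00" -> "23:02"
                hh=${hm%:*}      # hour field:        "23"
                mm=${hm#*:}      # minute field:      "02"
                # ${hh#0} removes at most one leading zero, from the hour only
                printf '%s.%s\n' "${hh#0}" "$mm"
                ;;
            *)
                printf '%s\n' "$line"    # bypass
                ;;
        esac
    done
}

# Small demo with inline data instead of data.txt
printf '%s\n' some 12:45:00 00:09:00 23:02:00 'and here as well' | to_hmm
```

To run it against the file from the first post, use `to_hmm < data.txt`. Like the pattern it borrows, `[0-2][0-9]` also lets hours such as 29 through.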

Last edited by brontosaurusrex (2020-08-25 20:20:44)

#4 2020-08-25 20:22:42

Sector11
Conky 1.9er Mod Squid
From: Upstairs
Registered: 2015-08-20
Posts: 6,401

One line:

 25 Aug 20 @ 17:18:42 ~
   $ sed 's@\(..\):\(..\):\(..\)@\1:\2@' ~/time.data.txt
some
12:45
other
14:45
data
is
here
00:09
23:02 
 25 Aug 20 @ 17:19:16 ~
   $ sed 's@\(..\):\(..\):\(..\)@\1:\2@' ~/time.data.txt >~/tdata.txt

new file ~/tdata.txt

EDIT:  Not me, I'm not that smart.  Google is though.
Check #4

Last edited by Sector11 (2020-08-25 20:24:50)


#5 2020-08-26 02:10:01

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 7,308

Borrowing part of ohnonot's regex, alternative sed 1-liner:

sed -nr 's/^([0-2][0-9]:[0-5][0-9]).*$/\1/p' <<<"$data"

That will accept lines with extra stuff after the H M S part too (deleting it), which might or might not be good.
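
For comparison, a sed sketch that goes all the way to H.MM and keeps the other lines. With -n and the p flag, the one-liner above prints only the lines that matched, and both sed versions so far leave the colon and the leading zero in place. The address below is anchored at both ends, so a line with extra text after the seconds is passed through untouched rather than trimmed. The function name hmm_sed is an assumption for the example, as is the unpadded hour.

```shell
#!/bin/bash
# hmm_sed (hypothetical name): convert HH:MM:SS lines to H.MM with sed,
# printing all other lines unchanged. Reads stdin or the files given.
# Inside the braces: drop ":SS", turn the remaining ":" into ".",
# then strip one leading zero from the hour.
hmm_sed() {
    sed '/^[0-2][0-9]:[0-5][0-9]:[0-5][0-9]$/{
        s/:[0-9][0-9]$//
        s/:/./
        s/^0//
    }' "$@"
}

# Small demo with inline data instead of data.txt
printf '%s\n' some 12:45:00 00:09:00 23:02:00 'and here as well' | hmm_sed
```

Usage on the test file would be `hmm_sed data.txt`. Restricting the three substitutions to the addressed lines keeps them from touching anything that merely contains a colon or starts with a zero.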


#6 2020-08-26 20:30:27

twoion
ほやほや
Registered: 2015-08-10
Posts: 2,942

Hej, the OP explicitly asked for H.MM format and NOT HH.MM or HH:MM :)

I think this is the clearest code in this thread yet :)

gawk -NF: '
  /^[0-9]+:[0-9]+:[0-9]+$/ {
    printf "%d.%02d\n", $1, $2;
  }
' data.txt

This one is more verbose, but it makes it easier to handle the "bypassed" lines if you want to do something with those too, because you can just add an else branch:

gawk -NF: '
  {
    if ($0 ~ /^[0-9]+:[0-9]+:[0-9]+$/) {
      printf "%d.%02d\n", $1, $2;
    } else {
      print "Woo";
    }
  }
' data.txt
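
If the bypassed lines should simply be printed as they are, which is what the first post asks for, a variant along these lines might do. It is a sketch, not a tested drop-in: it uses plain awk with `-F:` and the same pattern and printf format as above, and the wrapper name hmm_awk is invented for the example.

```shell
#!/bin/bash
# hmm_awk (hypothetical name): HH:MM:SS lines become H.MM,
# every other line is printed unchanged. Reads stdin or the files given.
hmm_awk() {
    awk -F: '
        # time line: %d prints the hour without its leading zero,
        # %02d keeps the minutes at two digits; next skips the rule below
        /^[0-9]+:[0-9]+:[0-9]+$/ { printf "%d.%02d\n", $1, $2; next }
        # anything else: bypass
        { print }
    ' "$@"
}

# Small demo with inline data instead of data.txt
printf '%s\n' some 12:45:00 00:09:00 23:02:00 'and here as well' | hmm_awk
```

Called as `hmm_awk data.txt`, this should print 12.45, 14.45, 0.09 and 23.02 for the four time lines of the sample file and leave the rest alone.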

One-liners are not always best. Hard to read and slow to understand...


#7 2020-08-27 16:03:18

brontosaurusrex
Middle Office
Registered: 2015-09-29
Posts: 2,102

@sector, john, twoion, thanks, I will test everything.

twoion wrote:
One-liners are not always best. Hard to read and slow to understand...

^ Certainly one for the remembrance.

Last edited by brontosaurusrex (2020-08-27 16:03:47)
