
#26 2019-03-08 07:24:37

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 5,713

Re: bash script function to parse config files

johnraff wrote:

I'll have a look round some config files of apps we use like network-manager, lightdm etc and see if there's any kind of consensus.

Time to tie this up.
Examples
avahi-daemon.conf has this line:

browse-domains=0pointer.de, zeroconf.org

A space-and-comma-separated list.
/etc/lightdm/users.conf has:

[UserList]
hidden-users=nobody nobody4 noaccess

Again, space-separated list.
Wavering a bit, but allowing entries like this (with no surrounding quotes) while still allowing comments will make the parsing regexes less robust IMO, so at this point I think I'm going to insist that unquoted entries must be single words, with no spaces. If future developers find this too restrictive for their scripts, let's loosen it up then.
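To make the single-word rule concrete, here is a small sketch that feeds a few sample lines to the entry regex used in the functions below. check_line is a hypothetical helper added only for this illustration, and the sample keys are made up apart from the two lines quoted above.

```shell
#!/usr/bin/env bash
# Entry regex copied from the functions in this post.
entry_regex="^[[:blank:]]*([[:alpha:]_][[:alnum:]_]*)[[:blank:]]*=[[:blank:]]*('[^']+'|\"[^\"]+\"|[^#[:blank:]]+)[[:blank:]]*(#.*)*$"

# check_line is a hypothetical helper, not part of the original functions.
# It reports whether a single line would be accepted as a key=value entry.
check_line(){
    if [[ $1 =~ $entry_regex ]]
    then
        echo "accepted: key=${BASH_REMATCH[1]} value=${BASH_REMATCH[2]}"
    else
        echo "rejected"
    fi
}

check_line 'colour=blue # a comment'                    # single unquoted word
check_line 'hidden_users="nobody nobody4 noaccess"'     # quoted list
check_line 'hidden_users=nobody nobody4 noaccess'       # unquoted list
check_line 'browse-domains=0pointer.de, zeroconf.org'   # hyphen in key name
```

With this regex the first two lines should be accepted and the last two rejected: the unquoted list breaks the single-word rule, and the avahi-style line also fails because hyphens are not allowed in key names.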

So the current functions would now look like this:

# These functions need bash.

# Usage: parse_config <file> [<default array name>]

# If no default array name is given, it defaults to 'config'.
# If there are [section] headers in file, following entries will be
#  put in array of that name.

# Config arrays may exist already and will be appended to or overwritten.
# If preexisting array is not associative, function exits with error.
# New arrays will be created as needed, and remain in the environment.
parse_config(){
    [[ -f $1 ]] || { echo "$1 is not a file." >&2;return 1;}
    if [[ -n $2 ]]
    then
        local -n config_array=$2
    else
        local -n config_array=config
    fi
    declare -Ag ${!config_array} || return 1
    local line key value section_regex entry_regex
    section_regex="^[[:blank:]]*\[([[:alpha:]_][[:alnum:]_]*)\][[:blank:]]*(#.*)?$"
    entry_regex="^[[:blank:]]*([[:alpha:]_][[:alnum:]_]*)[[:blank:]]*=[[:blank:]]*('[^']+'|\"[^\"]+\"|[^#[:blank:]]+)[[:blank:]]*(#.*)*$"
    while read -r line
    do
        [[ -n $line ]] || continue
        [[ $line =~ $section_regex ]] && {
            local -n config_array=${BASH_REMATCH[1]}
            declare -Ag ${!config_array} || return 1
            continue
        }
        [[ $line =~ $entry_regex ]] || continue
        key=${BASH_REMATCH[1]}
        value=${BASH_REMATCH[2]#[\'\"]} # strip quotes
        value=${value%[\'\"]}
        config_array["${key}"]="${value}"
    done < "$1"
}

# Usage: parse_config_vars <file>
# No arrays, just read variables individually.
# Preexisting variables will be overwritten.

parse_config_vars(){
    [[ -f $1 ]] || { echo "$1 is not a file." >&2;return 1;}
    local line key value entry_regex
    entry_regex="^[[:blank:]]*([[:alpha:]_][[:alnum:]_]*)[[:blank:]]*=[[:blank:]]*('[^']+'|\"[^\"]+\"|[^#[:blank:]]+)[[:blank:]]*(#.*)*$"
    while read -r line
    do
        [[ -n $line ]] || continue
        [[ $line =~ $entry_regex ]] || continue
        key=${BASH_REMATCH[1]}
        value=${BASH_REMATCH[2]#[\'\"]} # strip quotes
        value=${value%[\'\"]}
        declare -g "${key}"="${value}"
    done < "$1"
}

OP will also be updated.
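For anyone wanting to try it, a minimal usage sketch follows. It repeats parse_config as posted above so that it runs on its own; the config file contents and the key names (colour, brightness, title) are invented for the demo.

```shell
#!/usr/bin/env bash
# parse_config copied from the post above (usage comments trimmed).
parse_config(){
    [[ -f $1 ]] || { echo "$1 is not a file." >&2;return 1;}
    if [[ -n $2 ]]
    then
        local -n config_array=$2
    else
        local -n config_array=config
    fi
    declare -Ag ${!config_array} || return 1
    local line key value section_regex entry_regex
    section_regex="^[[:blank:]]*\[([[:alpha:]_][[:alnum:]_]*)\][[:blank:]]*(#.*)?$"
    entry_regex="^[[:blank:]]*([[:alpha:]_][[:alnum:]_]*)[[:blank:]]*=[[:blank:]]*('[^']+'|\"[^\"]+\"|[^#[:blank:]]+)[[:blank:]]*(#.*)*$"
    while read -r line
    do
        [[ -n $line ]] || continue
        [[ $line =~ $section_regex ]] && {
            local -n config_array=${BASH_REMATCH[1]}
            declare -Ag ${!config_array} || return 1
            continue
        }
        [[ $line =~ $entry_regex ]] || continue
        key=${BASH_REMATCH[1]}
        value=${BASH_REMATCH[2]#[\'\"]} # strip quotes
        value=${value%[\'\"]}
        config_array["${key}"]="${value}"
    done < "$1"
}

# Throwaway config file with made-up keys, just for this demo.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# entries before any [section] go into the default array 'config'
colour=blue
[display]
brightness = 80   # trailing comments are allowed
title = "My Desktop"
EOF

parse_config "$conf"
echo "config[colour]      = ${config[colour]}"
echo "display[brightness] = ${display[brightness]}"
echo "display[title]      = ${display[title]}"
rm -f "$conf"
```

Entries above the first header land in the default array, and everything after [display] goes into an array named display, created on the fly.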


John
--------------------
( a boring Japan blog , Japan Links, idle twitterings  and GitStuff )
In case you forget, the rules.


#27 2019-03-10 19:52:49

ohnonot
...again
Registered: 2015-09-29
Posts: 3,895

Re: bash script function to parse config files

nice, thanks again, I dropped it into my weather script and it seems to be working ok (first try).

btw, i had an interesting, somewhat related thread at LQ recently.


#28 2019-03-16 19:56:55

twoion
ほやほや
Registered: 2015-08-10
Posts: 2,482

Re: bash script function to parse config files

johnraff wrote:

There is a Debian package cfget but it's almost abandoned, and didn't do exactly what I wanted, eg doesn't ignore comments, no arrays.

@twoion if you ever feel like refreshing iniparser maybe we could think about packaging it? It offers a lot more functionality than my bash function, and probably would run just as fast.

John,

I have prepared a Debian package for the 'ini' tool, see https://github.com/2ion/ini. It should build without changes on Debian Buster/BL Lithium and higher. Let me know how it fares, should you decide to give it a try.


A silent kite against the blue, blue sky


#29 2019-03-17 08:57:36

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 5,713

Re: bash script function to parse config files

^Thanks!
'ini' built fine on Buster and seems to do its job.

There were a couple of Lintian messages, and I noticed that no man page was generated, although the code seems to be providing for one. Also, I wonder if the package name 'ini' might be a bit short, potentially intruding on other namespaces?

More important, though, is that as it is it doesn't seem to do exactly what I want:

Variables that don't come under a [section] are ignored.

The main point is that I'd like to have a set of default key/value pairs already set in the script calling 'ini', and possible config files under /usr/share, /etc and ~/.config parsed in turn to look for any variables to overwrite. I can't see any way to do this without a lot of shell code and multiple calls to 'ini'. Am I missing something?



#30 2019-03-17 11:25:31

twoion
ほやほや
Registered: 2015-08-10
Posts: 2,482

Re: bash script function to parse config files

johnraff wrote:

Variables that don't come under a [section] are ignored.

That's a parser limitation. Though it is not uncommon to require INI files to have at least one section.

The main point is that I'd like to have a set of default key/value pairs already set in the script calling 'ini', and possible config files under /usr/share, /etc and ~/.config parsed in turn to look for any variables to overwrite. I can't see any way to do this without a lot of shell code and multiple calls to 'ini'. Am I missing something?

So you're looking mainly to merge INI files with a specific order of precedence. Neither the parser used in the command line tool at hand nor the structure of the tool as it is now lends itself to the task. It would need more work.

A quicker way could be to implement the logic you want in a shortish Python program using their configuration (INI) parser library like so (proof-of-concept only!):

#!/usr/bin/env python3

from argparse import ArgumentParser, ArgumentDefaultsHelpFormatter
from configparser import ConfigParser, ExtendedInterpolation

if __name__ == "__main__":
    ap = ArgumentParser(formatter_class=ArgumentDefaultsHelpFormatter)
    ap.add_argument("-v", "--value", action="append", default=[],
            help="Specify a preset value in the format section.key=value")
    ap.add_argument("-P", "--print")
    ap.add_argument("ini_files", nargs='+')
    ap = ap.parse_args()

    cp = ConfigParser(inline_comment_prefixes=('#',';',),
            interpolation=ExtendedInterpolation())

    # Load configuration from the preset and all configuration files. Order of
    # specification on the command line matters!

    preset_dict = dict()
    for v in ap.value:
        section, tail = v.split('.', 1)
        key, value = tail.split('=', 1)
        sdict = preset_dict.setdefault(section, dict())
        sdict[key] = value

    cp.read_dict(preset_dict)
    cp.read(ap.ini_files, encoding="utf-8")

    # PRINT action
    #
    #   -P section.key

    if ap.print:
        sec, key = ap.print.split('.', 1)
        print(cp.get(sec, key))

This may be called like

python3 inilookup -v a.b=c -P a.b first.ini second.ini third.ini

which would pre-set the key b in section a to c and print a.b, but also read three INI files, each of which may extend or overwrite any previous key in any section (values are merged in order). Leveraging Python's config parser would also allow for something like type validation (value must be a number, a boolean, and so on): for example, if we wrote -P a.b:bool we could specify that the value of b in section a should be interpretable as a boolean flag.

We could also leverage interpolation. This would allow values in other INI sections to be referenced within the INI file, and also allow for some templating: set the key system.home, and in another INI value you could then say config.path=${system:home}. Further, we could import all environment variables into a virtual 'ENV' section, so INI files read with the tool could use ${ENV:HOME}, ${ENV:DISPLAY}, ${ENV:EDITOR} and suchlike in their values to reference the values of these environment variables.

The issue with having keys outside of sections is that this is not allowed by default in the Python parser (though it can be made to obey), nor in most other parsers: if we assume that each key-value pair in an INI file belongs to a section, where does a section-less key belong? Perhaps in a virtual section like 'SECTIONLESS'. Note that the Python parser also supports a DEFAULT section where you can place a number of default key-value pairs that are available in any other section, but may be overwritten there.

The issue with INI is that there is no specification: there are dozens of format dialects, each allowing some features but not necessarily those found in another.



#31 2019-03-18 12:05:29

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 5,713

Re: bash script function to parse config files

This all started when I was looking at what people recommended for shell scripts' config files. The most common advice was just to source shell fragments like:

key1=val1
key2=val2

but the problem with that is that the "config" file can contain any arbitrary code which the calling script will just run as-is, no questions asked. Otherwise there are a couple of big complicated parsers on GitHub. I thought it could be done more simply...
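A quick way to see the difference is the sketch below. It builds a throwaway "config" file that carries an extra command, then sources it and, separately, reads it as data. The file contents, the marker file and the simplified entry_re pattern are all assumptions made for this demo, not the regex from the OP.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
conf=$demo_dir/demo.conf
marker=$demo_dir/marker

# A "config" file that smuggles in a command after a normal-looking entry.
# touch stands in for any arbitrary command.
cat > "$conf" <<EOF
key1=val1
touch "$marker"
EOF

# Approach 1: source the file (in a subshell). Every line runs as shell code.
( . "$conf" )
if [[ -e $marker ]]; then sourced_ran=yes; else sourced_ran=no; fi
rm -f "$marker"

# Approach 2: read the file as data and keep only key=value lines.
# entry_re is a simplified pattern for this demo only.
entry_re='^([[:alpha:]_][[:alnum:]_]*)=([^#[:blank:]]+)$'
parsed=()
while read -r line
do
    [[ $line =~ $entry_re ]] || continue
    parsed+=("${BASH_REMATCH[1]}=${BASH_REMATCH[2]}")
done < "$conf"
if [[ -e $marker ]]; then parsed_ran=yes; else parsed_ran=no; fi

echo "sourcing ran the embedded command: $sourced_ran"
echo "parsing ran the embedded command:  $parsed_ran"
echo "entries kept by the parser: ${parsed[*]}"
rm -rf "$demo_dir"
```

Sourcing executes the touch line, while the read loop only ever treats lines as text, so anything that does not look like key=value is dropped without being run.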

Anyway the two functions in the OP will fetch variables - or arrays - only, so a certain improvement in safety. I wanted something easy to use for developers and reasonably intuitive for users. (Indeed, there's no standard for ini, conf or rc files.)

It now occurs to me, though, that allowing the config file to overwrite any variable means giving it freedom to change any variable in the calling script's environment - not only those the dev had in mind. Maybe the second variables-only function parse_config_vars() should be abandoned in favour of the first one parse_config() which at least packs the values away in a named array. Even that one leaves the window open that any section can be defined in the parsed config file and an array created with that name, populated with key-value pairs. If that array already exists, the keys will be overwritten, even if it's something in the environment, not directly connected with the calling script. (eg BASH_ALIASES although that one might not be relevant to scripts.)

So the only safe way might be to ignore sections and put key/value pairs in an associative array named by the calling script?

I guess even that might be enough for a lot of purposes?
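A minimal sketch of that idea, assuming the same entry regex as the OP functions: parse_config_flat is a hypothetical name, and the demo file and the settings array are made up. Section headers are not entries, so they are simply skipped and everything lands in the one array the caller names.

```shell
#!/usr/bin/env bash
# parse_config_flat is a hypothetical name, not one of the functions in the OP.
# Usage: parse_config_flat <file> <associative array name>
# [section] headers are ignored; all entries go into the named array.
parse_config_flat(){
    [[ -f $1 ]] || { echo "$1 is not a file." >&2; return 1;}
    [[ -n $2 ]] || { echo "No array name given." >&2; return 1;}
    local -n flat_array=$2
    declare -Ag "$2" || return 1  # fails if $2 is already an indexed array
    local line value entry_regex
    entry_regex="^[[:blank:]]*([[:alpha:]_][[:alnum:]_]*)[[:blank:]]*=[[:blank:]]*('[^']+'|\"[^\"]+\"|[^#[:blank:]]+)[[:blank:]]*(#.*)*$"
    while read -r line
    do
        # Section headers, comments and blank lines never match entry_regex.
        [[ $line =~ $entry_regex ]] || continue
        value=${BASH_REMATCH[2]#[\'\"]} # strip quotes
        value=${value%[\'\"]}
        flat_array["${BASH_REMATCH[1]}"]=$value
    done < "$1"
}

# Demo: a config file that tries to reach BASH_ALIASES through a section name.
conf=$(mktemp)
cat > "$conf" <<'EOF'
volume=7
[BASH_ALIASES]
ls="echo hijacked"
EOF

declare -A settings=([volume]=5)   # default set by the calling script
parse_config_flat "$conf" settings
declare -p settings
rm -f "$conf"
```

Here the ls entry ends up as an ordinary key in settings and BASH_ALIASES is left alone. The cost is that keys with the same name in different sections overwrite each other.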



#32 2019-03-18 12:37:23

Bearded_Blunder
Dodging A Bullet
From: Seat: seat0; vc7
Registered: 2015-09-29
Posts: 730

Re: bash script function to parse config files

Some things are tough to protect against; it boils down to "don't run untrustworthy stuff". I can change any variable my user has access to and run arbitrary code from openbox autostart if I put silly stuff there too. I commend the effort, but frankly if someone's gotten a bad script onto your machine the game's lost. Just put a shell script replacing an arbitrary program in ~/bin and what's in it gets run every bit as unchecked as a script sourced for variables, and being earlier in PATH it gets run before the actual intended program/script.

You could do some unfortunate things with a script named apt that typically gets called using sudo.

I've only ever really viewed this as some protection against user errors in a config. If you're wanting to protect against potentially malicious things the user didn't put there themselves, it's probably time to concede defeat.


Blessed is he who expecteth nothing, for he shall not be disappointed...
If there's an obscure or silly way to break it, but you don't know what.. Just ask me


#33 2019-03-20 05:50:56

johnraff
nullglob
From: Nagoya, Japan
Registered: 2015-09-09
Posts: 5,713

Re: bash script function to parse config files

Agreed, basically. Only thing is that these functions are intended to be libraries that any developer can pull into their script, so they ought to be as robust as possible IMO. We don't know what kind of script they might be used in, whether run as root or not... Of course the dev writing the script is supposed to be on top of that stuff...

Anyway, I've added an option to check all variables against a whitelist before overwriting them. (Only the straight-variable parse_config_vars() function so far.) It still works as before without a list, but if the name of an associative array is passed as $2, then variable names will be checked against the keys of that array and only overwritten if they are in the list.

Set up the array in the calling script something like this

declare -A checklist
for var in some_setting squidliness frobnacity
do
    checklist[$var]=1 # any non-empty string other than X will do
done

Then parse the config file(s) with:

parse_config_vars /path/to/configfile checklist

Only the three variables in checklist will be overwritten (if mentioned in configfile).

---
It doesn't take a lot of extra code in the function, this at the start:

    local check=false
    if [[ -n $2 ]]
    then
        local -n list=$2 # list is a nameref to $2
        [[ "$(declare -p ${!list} 2>/dev/null)" = 'declare -A'* ]] || { echo "$2 is not an associative array." >&2; return 1;}
        check=true
    fi

and this just before writing out the new variable value:

[[ $check = true && ${list[$key]:-X} = X ]] && continue
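One side effect of the :-X test is that a checklist key whose value is empty, or literally X, is treated as if it were not in the list. The sketch below compares it with the ${var+word} expansion, which only asks whether the element is set. The checklist values are chosen to show the difference, and the results array exists only for this demo.

```shell
#!/usr/bin/env bash
# Values picked to show the edge cases: an empty string and a literal X.
declare -A checklist=([some_setting]=1 [squidliness]='' [frobnacity]=X)
declare -A results

for key in some_setting squidliness frobnacity not_listed
do
    # Test used in the post: empty or "X" values look like missing keys.
    if [[ ${checklist[$key]:-X} = X ]]; then orig=skip; else orig=allow; fi
    # ${var+word} expands to word whenever the element is set, even if empty.
    if [[ -z ${checklist[$key]+set} ]]; then alt=skip; else alt=allow; fi
    results[$key]="$orig/$alt"
    echo "$key: ':-X' test -> $orig, '+set' test -> $alt"
done
```

Both tests agree for some_setting and not_listed; they differ only for the empty and X values, so this matters only if the checklist is filled with something other than a plain 1.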

So the whole new function looks like this:

# Usage: parse_config_vars <file> [<checklist array name>]
# No arrays, just read variables individually.
# Preexisting variables will be overwritten.
# If name of checklist associative array is provided, then
# only variables which are keys of that array will be (over)written.

parse_config_vars(){
    local check=false
    if [[ -n $2 ]]
    then
        local -n list=$2
        [[ "$(declare -p ${!list} 2>/dev/null)" = 'declare -A'* ]] || { echo "$2 is not an associative array." >&2; return 1;}
        check=true
    fi
    [[ -f $1 ]] || { echo "$1 is not a file." >&2; return 1;}
    local line key value entry_regex
    entry_regex="^[[:blank:]]*([[:alpha:]_][[:alnum:]_]*)[[:blank:]]*=[[:blank:]]*('[^']+'|\"[^\"]+\"|[^#[:blank:]]+)[[:blank:]]*(#.*)*$"
    while read -r line
    do
        [[ -n $line ]] || continue
        [[ $line =~ $entry_regex ]] || continue
        key=${BASH_REMATCH[1]}
        [[ $check = true && ${list[$key]:-X} = X ]] && continue
        value=${BASH_REMATCH[2]#[\'\"]} # strip quotes
        value=${value%[\'\"]}
        declare -g "${key}"="${value}"
    done < "$1"
}

As I said, the checklist is optional - the function should still work as before too.
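Putting it together with the layered setup mentioned earlier in the thread (defaults in the script, then files under /usr/share, /etc and ~/.config read in turn), a calling script might look roughly like the sketch below. It repeats the function above so it runs on its own. The myapp name, the file contents and the temporary directory standing in for the real system paths are all assumptions for the demo.

```shell
#!/usr/bin/env bash
# parse_config_vars as posted above (usage comments trimmed).
parse_config_vars(){
    local check=false
    if [[ -n $2 ]]
    then
        local -n list=$2
        [[ "$(declare -p ${!list} 2>/dev/null)" = 'declare -A'* ]] || { echo "$2 is not an associative array." >&2; return 1;}
        check=true
    fi
    [[ -f $1 ]] || { echo "$1 is not a file." >&2; return 1;}
    local line key value entry_regex
    entry_regex="^[[:blank:]]*([[:alpha:]_][[:alnum:]_]*)[[:blank:]]*=[[:blank:]]*('[^']+'|\"[^\"]+\"|[^#[:blank:]]+)[[:blank:]]*(#.*)*$"
    while read -r line
    do
        [[ -n $line ]] || continue
        [[ $line =~ $entry_regex ]] || continue
        key=${BASH_REMATCH[1]}
        [[ $check = true && ${list[$key]:-X} = X ]] && continue
        value=${BASH_REMATCH[2]#[\'\"]} # strip quotes
        value=${value%[\'\"]}
        declare -g "${key}"="${value}"
    done < "$1"
}

# Defaults set by the calling script.
some_setting=default
squidliness=low

# A temporary tree stands in for /usr/share, /etc and ~/.config.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/usr/share/myapp" "$demo_root/home/.config/myapp"
echo 'squidliness=medium' > "$demo_root/usr/share/myapp/myapp.conf"
printf '%s\n' 'squidliness=high' 'unrelated_var=oops' \
    > "$demo_root/home/.config/myapp/myapp.conf"

declare -A checklist=([some_setting]=1 [squidliness]=1)

# Later files override earlier ones; missing files are skipped.
for f in "$demo_root/usr/share/myapp/myapp.conf" \
         "$demo_root/etc/myapp/myapp.conf" \
         "$demo_root/home/.config/myapp/myapp.conf"
do
    if [[ -f $f ]]; then parse_config_vars "$f" checklist; fi
done

echo "some_setting=$some_setting squidliness=$squidliness"
rm -rf "$demo_root"
```

some_setting keeps its default because no file mentions it, squidliness takes the value from the last file that sets it, and unrelated_var is never created because it is not in the checklist.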

If no-one reports any issues I'll do the same with the other function for section names, and update the OP.

