Paste #p90yefjzq

  1. <a> Hello, I need to search for a pattern in a file and, when the pattern is found, print only the 10th line after it. Thank you.
  2. <a> the pattern could be found more than once.
  3. <CcxWrk> wwilliam: You can use NR variable in the condition.
  4. <CcxWrk> Like /mypattern/{print_line[NR+10]=1}
  5. <a> Thanks CcxWrk I will test.
  6. <b> Hey everyone! Hi from Canada.
  7. <CcxWrk> and then just check for print_line[NR] which, without body, will print the line
  8. <CcxWrk> Hello TRS^
  9. <b> TRS= The Rolling Stones ;) Hi CcxWrk
  10. <b> I was wondering if anyone had a few minutes to give me some of your expert guidance using awk.
  11. <CcxWrk> Ask away, either someone will have the time or not. :-)
  12. <b> Thanks
  13. <CcxWrk> TRS^: BTW "The AWK Programming Language" book is highly recommended reading if you are starting out or even moderately experienced. Written by authors of the language and very didactic.
  14. <b> I would like to pull the IP addresses out of a series of text files that are all stored in subdirectories under a folder called Desktops. The text files all have the same format (it's Windows event #4625). Before each IP address that I want to extract and store in its own output file is the label 'Source Network Address:', for example 'Source Network Address: 99.155.154.135'.
  15. <b> Thank you CcxWrk
  16. <b> I would like to read that book for sure
  17. <CcxWrk> Well, AWK itself can't navigate the filesystem, so it will need some help from whichever shell you have around and/or the UNIX "find" program.
  18. <b> re='Source Network Address:([0-9.]*) ; while IFS= read -r line; do [[ $line =~ $re ]] || continue; ip=${BASH_REMATCH[1]}; then echo "$ip"; fi; done < /mnt/logs/logs/Desktops/*${now}.log   >> AttackersIP.txt
  19. <b> That's how far I have gotten so far
  20. <b> sorry
  21. <b> here
  22. <b> https://gist.github.com/StephenSuley/3b51b32f54c6904898ba8ef4a89eecca
  23. <b> I define ${now} as todays date earlier in the script.
  24. <CcxWrk> So (aside from it not being AWK) what you want to do is to split it into several files based on original filename?
  25. <CcxWrk> And strip out the header before the actual IP address?
  26. <b> I would like all the IPs from all the various input files extracted and appended into a single text file
  27. <b> so I should be in #bash?
  28. <b> sorry CcxWrk.
  29. <b> CcxWrk> And strip out the header before the actual IP address?: Yes.
  30. <b> all I would like inside the file is the IP addresses, one on each line
  31. <CcxWrk> Well, if you want help with bash… :-) I'd start with grep -o, possibly followed by sort -u
  32. <b> lol
  33. <CcxWrk> There is a way to reimplement it in AWK but in this case likely not worth it.
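A minimal sketch of the grep -o plus sort -u suggestion, run on made-up sample lines rather than the real logs (the addresses are placeholders). grep -o is a widely available extension rather than strict POSIX, so its availability is an assumption here.

```shell
# Three hypothetical log lines; one address appears twice.
# grep -o prints only the matched text, sort -u sorts and drops duplicates.
unique_ips=$(
  printf '%s\n' \
    'Source Network Address: 10.0.0.1' \
    'Source Network Address: 10.0.0.2' \
    'Source Network Address: 10.0.0.1' |
  grep -Eo '([0-9]+\.){3}[0-9]+' |
  sort -u
)
printf '%s\n' "$unique_ips"
```

Each address is printed once, one per line.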
  34. <b> how do I use the grep command and get it to recursively check each subdirectory
  35. <CcxWrk> grep pattern file1 file2 …
  36. <b> I won't know the filenames.
  37. <CcxWrk> Or find mydirectory -my-search-options -exec grep mypattern '{}' +
  38. <CcxWrk> Is recursive search needed? Since you seem to have simple glob there.
  39. <b> The background is that it's an environment of 500 desktops, and each time event #4625 is received 8 times within 20 minutes, a directory named after the host's IP address is created and a text file with the event log of the failed login is put inside.
  40. <CcxWrk> Grep can recursively search all files but you seem to want to match only specific filenames, hence find.
  41. <b> it's only those files I want to search there
  42. <b> anything with a .txt in all the directories can be included.
  43. <b> or all I guess
  44. <b> doesn't need a .txt
  45. <CcxWrk> Yeah, start by learning `find`.
  46. <b> since its only those files
  47. <CcxWrk> find mydirectory -type f -iname '*.txt' -exec ...
  48. <CcxWrk> -type f is for plain files and -iname is case insensitive match
  49. <CcxWrk> Start without exec and see if it enumerates all files you want.
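The "enumerate first, act later" advice could be tried on a throwaway tree like the one below. The directory and file names are invented for illustration, and -iname is assumed to be supported by the local find (it is in GNU, BSD and BusyBox find).

```shell
# Throwaway directory tree standing in for the real log folder.
root=$(mktemp -d)
mkdir -p "$root/hostA" "$root/hostB"
touch "$root/hostA/one.txt" "$root/hostB/TWO.TXT" "$root/hostB/notes.log"

# -type f keeps plain files; -iname matches the name case-insensitively.
# Counting lines is safe here only because the sample names contain no newlines.
matches=$(find "$root" -type f -iname '*.txt' | wc -l)
echo "matched files: $matches"
rm -rf "$root"
```

Only the two .txt files (in either letter case) are counted; notes.log is skipped.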
  50. <b> find /mnt/logs/logs/Desktops/ -type f
  51. <b> That found every file I need to consider.
  52. <b> Every night at midnight all the files are cleaned out and archived.
  53. <b> So the command find /mnt/logs/logs/Desktops/ -type f will work.
  54. <CcxWrk> OK then. Next step is to let find call your program to extract what you want on given filenames.
  55. <CcxWrk> Quick question: Which line ending convention do the files have? Because having that mismatched can produce "fun" results.
  56. <b> grep  grep -Eo '([0-9]+\.){3}[0-9]+' || find /mnt/logs/logs/Desktops/ -type f
  57. <b>  grep -Eo '([0-9]+\.){3}[0-9]+' || find /mnt/logs/logs/Desktops/ -type f
  58. <b> How's that?
  59. <b> FailedLogins-2020-05-18-06AM.txt
  60. <CcxWrk> The other way around. :-] Look into the -exec + form of find.
  61. <b> every text file has this format
  62. <CcxWrk> Run "file" on them. As in: file ./path/to/FailedLogins-2020-05-18-06AM.txt
  63. <CcxWrk> Do you know what || does and how it differs from | ?
  64. <b> I did. I can't recall atm
  65. <b> <> and && and ||  etc....I can't recall from memory each of them however
  66. <b> > replace >> append.
  67. <b> My memory isn't so great.
  68. <a> hello i have not been able to print anything im using this:
  69. <a> awk '/cannot/ {print_line[NR+2]=1}' log_2020_05_14_10_10_16
  70. <CcxWrk> && and || are "and" and "or" operators taken from C language. They run right-side command if the left-side succeeded or failed respectively.
  71. <a> need to print line number 2 after the pattern, I dont need to print the pattern
  72. <CcxWrk> TRS^: Write your own cheat-sheet / quick reference, that's one of best way to learn.
  73. <b> ty
  74. <b> CcxWrk: I ran the file command but I got an error
  75. <b> */mnt/logs/logs/Desktops/172.17.153.30/FailedLogins-2020-05-18-06AM.txt: cannot open `/mnt/logs/logs/Desktops/172.17.153.30/FailedLogins-2020-05-18-06AM.txt' (No such file or directory)
  76. <CcxWrk> wwilliam: Yeah, you need both parts of what I wrote. :-) The first one only marks given lines in associative array.
  77. <CcxWrk> Or stores line numbers rather.
  78. <a> Let's see.
  79. <CcxWrk> TRS^: Something is wrong then. If you can't access the files it will be hard to extract anything out of them. :-)
  80. <b> got it
  81. <b> sorry
  82. <b> typo
  83. <b> it responded like this --- /mnt/logs/Logs/Desktops/BranchFailedLogins/172.17.153.30/FailedLogins-2020-05-18-06AM.txt: ASCII text, with CRLF line terminators
  84. <CcxWrk> Anyway I think you were looking for | which is pipeline, chaining output of one command into input of another. But it doesn't work the way you think.
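For a personal cheat-sheet, the three operators discussed above can be compared side by side. This is only an illustrative sketch; the variable names are arbitrary.

```shell
# && runs the right-hand command only if the left-hand one succeeded.
and_result=$(true && echo 'ran after success')
# || runs the right-hand command only if the left-hand one failed.
or_result=$(false || echo 'ran after failure')
# | connects the left command's output to the right command's input.
pipe_result=$(printf 'b\na\n' | sort | head -n 1)
printf '%s\n' "$and_result" "$or_result" "$pipe_result"
```

Note that with | both commands always run; with && and || the right-hand side may not run at all.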
  85. <a>  both parts?   awk '/cannot/ {print_line[NR+2]=1}'  is not this both parts?
  86. <CcxWrk> TRS^: Okay, so Windows (CRLF) line endings as opposed to Unix (LF) ones. So you will want to strip the CR bytes before you print it into terminal, otherwise you'll get mangled output.
  87. <CcxWrk> There is dos2unix but I recommend just using `tr -d '\r'` here.
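A small sketch of what tr -d '\r' does, using two invented CRLF-terminated lines instead of a real log file.

```shell
# Two hypothetical lines with Windows (CRLF) endings.
clean=$(printf 'first\r\nsecond\r\n' | tr -d '\r')
# Count any CR bytes that survived the filter.
leftover=$(printf '%s' "$clean" | tr -cd '\r' | wc -c)
echo "carriage returns left: $leftover"
```

On a real file the same filter would sit in a pipeline, for example tr -d '\r' < somefile | awk ..., where somefile is a placeholder.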
  88. <b> ok
  89. <CcxWrk> wwilliam: <CcxWrk> and then just check for print_line[NR] which, without body, will print the line
  90. <CcxWrk> wwilliam: Are you familiar with how awk scripts are split into pattern match and body (actions to perform)?
  91. <a> I miss the above...but i dont understand.
  92. <a> A little familiar not too much im also dumb.
  93. <a> I take a long time to learn.
  94. <CcxWrk> Well, it's kinda hard to learn from reference documentation, that's why I recommend the book so much. It has actual introduction to the concepts.
  95. <a> Ok
  96. <a> thank you.
  97. <b> CcxWrk: How do I add 'Source Network Address: to the grep pattern along with  '([0-9]+\.){3}[0-9]+'
  98. <b>  this doesn't work .     grep -Eo 'Source Network Address: ([0-9]+\.){3}[0-9]+' /mnt/logs/Logs/Desktops/BranchFailedLogins/172.17.153.30/FailedLogins-2020-05-18-06AM.txt
  99. <CcxWrk> TRS^: Incorrect whitespace? Maybe tab instead of space or several spaces?
  100. <b> yeah I think it is a tab
  101. <CcxWrk> I'd start with grep '^Source Network Address:' and then perhaps strip everything up to the last whitespace on the line.
  102. <CcxWrk> \t then
  103. <b> how would that look? :)
  104. <b> This commands returns nothing......grep '^Source Network Address:' /mnt/logs/Logs/Desktops/BranchFailedLogins/172.17.153.30/FailedLogins-2020-05-18-06AM.txt
  105. <CcxWrk> wwilliam: I'm not trying to advocate piracy, but the PDF is findable on the internet. Anyway, AWK is a stream processor. It takes lines, matches on them and performs actions. An AWK script is structured as pairs of pattern{body}, either part of which can be missing.
  106. <CcxWrk> The pattern is usually /regular-expression/ but can be any expression that evaluates to true/false (nonzero or zero/empty actually)
  107. <CcxWrk> If the pattern is missing, the body is always performed. If the body is missing, it's equivalent to "print $0", that is, print the whole input line.
  108. <CcxWrk> TRS^: ^ means to anchor the search at the start of line. Does it perhaps have something in front? Perhaps it's the pesky CR again.
  109. <a> OK thanks .
  110. <CcxWrk> grep -E '^\r\?Source...
  111. <b> I think there is a tab in front as well
  112. <CcxWrk> wwilliam: So what I'm doing there is two rules. One adds numbers to internal data structure on match and other checks on it on further lines.
  113. <b> .....returns nothing.....grep -Eo '\r\^?Source Network Address:' /mnt/logs/Logs/Desktops/BranchFailedLogins/172.17.153.30/FailedLogins-2020-05-18-06AM.txt
  114. <b> opps
  115. <CcxWrk> wwilliam: So it would look like awk '/match/{foo[NR+2]=1} foo[NR]'
  116. <b> ....grep -Eo '^\r\?Source Network Address:' /mnt/logs/Logs/Desktops/BranchFailedLogins/172.17.153.30/FailedLogins-2020-05-18-06AM.txt
  117. <a> Ok CcxWrk testing.
  118. <CcxWrk> wwilliam: Which internally expands to '/match/{foo[NR+2]=1} foo[NR]{print $0}' that is two condition-action pairs.
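The two-rule program can be tried on a small invented input. The sample lines and the array name want are made up for illustration; only the structure (one rule that marks a future line number, one bare pattern that prints marked lines) comes from the discussion above.

```shell
# Hypothetical 8-line log: the wanted lines sit two lines after each "cannot".
sample=$(mktemp)
printf '%s\n' \
  'ok 1' 'cannot open' 'detail A' 'wanted 1' \
  'ok 2' 'cannot read' 'detail B' 'wanted 2' > "$sample"

# Rule 1: on a match, mark the record number two lines ahead.
# Rule 2: a bare pattern with no body prints the current line when it is marked.
result=$(awk '/cannot/ { want[NR + 2] = 1 } want[NR]' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

The matching lines themselves are not printed, because nothing ever marks their own record numbers.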
  119. <a> Ok
  120. <CcxWrk> TRS^: Interesting. Do you have xxd or perhaps hexdump available?
  121. sublim20    !faq
  122. <b> CcxWrk: Yes both are installed
  123. <CcxWrk> TRS^: Try then grep on some pattern that matches one line and pipe it to xxd and pastebin it, eg.: grep 'Source' myfile | xxd
  124. <b> ok
  125. <a> That hangs in my box CcxWrk
  126. <a> awk '/cannot/{foo[NR+2]=1} foo[NR]'axairlog_2020_05_14_10_10_16
  127. <a> never mind
  128. <a> thank you very much
  129. <a> that is what I needed
  130. <a> I missed a space after the awk command
  131. <a> Now it is giving the results I need. Thanks CcxWrk for your time and knowledge.
  132. <a> CcxWrk: how do i interpret the 1 in the command?
  133. <a> =1?
  134. <CcxWrk> NR is number of current "record". Which means line in most cases, unless you change the record delimiter.
  135. <b> CcxWrk: That worked...I ran...grep 'Address:' /mnt/logs/Logs/Desktops/BranchFailedLogins/172.17.153.30/FailedLogins-2020-05-18-06AM.txt || xxd ......it returned ........        Source Network Address: 99.254.184.135
  136. <CcxWrk> wwilliam: Data structures in AWK are weird in that when you access undefined data you don't get an error or an exception but an empty string.
  137. <CcxWrk> TRS^: You are again confusing | with ||
  138. <b> sorry
  139. <a> I see CcxWrk
  140. <b> ok
  141. <b> it worked....
  142. <b> https://pastebin.com/TA0Ds8vB
  143. <CcxWrk> wwilliam: So what I do there is I make a sparse associative array. The foo[X] will return "" for any X that has not been set.
  144. <CcxWrk> wwilliam: And 1 is by convention easiest way to spell "true" in AWK (and C).
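A short sketch of how set and unset array entries behave when used as conditions. The array name seen and the keys are arbitrary.

```shell
# Only key 3 is ever assigned; key 7 is read without being set.
result=$(awk 'BEGIN {
  seen[3] = 1
  if (seen[3])       print "3 is marked"
  if (!seen[7])      print "7 is unmarked"
  if (seen[7] == "") print "7 reads as empty"
}')
printf '%s\n' "$result"
```

All three messages are printed: the 1 counts as true, and the entry that was never assigned reads as an empty string, which counts as false.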
  145. <a> oh ok.
  146. <CcxWrk> Associative arrays are like dictionaries in JS or Python. Not necessarily ordered key-value pairs.
  147. <a> OK
  148. <a> Thank you for all your time and knowledge.
  149. <CcxWrk> TRS^: This is a hexadecimal dump of the output. Fairly handy for seeing what is *actually* in the file. You can open `man 7 ascii` in another terminal and see what each byte corresponds to (under the Hex header).
  150. <b> CcxWrk: Got it!
  151. <b> ........grep 'Address:' /mnt/logs/Logs/Desktops/BranchFailedLogins/172.17.153.30/FailedLogins-2020-05-18-06AM.txt | awk '{print $4}'
  152. <b> outputs only the IP
  153. <CcxWrk> So you can see it starts with 09 (horizontal tab) and ends with 0d 0a (Carriage Return, Line Feed)
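If xxd is not at hand, the POSIX od utility gives a comparable byte view. The sketch below dumps an invented line that begins with a tab and ends with CRLF; the sample text is not taken from the real log.

```shell
# Dump the bytes of a tab, the word "Source", then CR and LF.
# -An hides the offset column, -tx1 prints one hex byte at a time;
# the trailing tr removes the spacing so the bytes appear as one string.
hex=$(printf '\tSource\r\n' | od -An -tx1 | tr -d ' \n')
echo "$hex"
```

The string starts with 09 (the tab) and ends with 0d0a (carriage return, line feed), matching the pattern described above.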
  154. <b> Yes
  155. <b> now I'm trying to replace the static filename with the output of the find command you showed me
  156. <b> ......grep 'Address:' | find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f | awk '{print $4}'
  157. <b> can you help with that?
  158. <CcxWrk> So you probably want to do grep '^\tSource Network Address:' | sed 's/.*: //'
  159. <CcxWrk> Or rather, better and in awk:  awk -F '\t' '$2 == "Source Network Address:" {print $3}'
  160. <CcxWrk> The -F changes what AWK considers field (aka column) separator.
  161. <b> that worked as well.
  162. <b> .......awk -F '\t' '$2 == "Source Network Address:" {print $3}' /mnt/logs/Logs/Desktops/BranchFailedLogins/172.17.153.30/FailedLogins-2020-05-18-06AM.txt
  163. <b> outputs only the IP address wanted
  164. <CcxWrk> So, treating it as tab-separated table checks if second column equals to your label and prints the third.
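The tab-separated approach can be checked on a made-up two-line file. The field layout (a leading tab, then the label, then the value) is assumed from the hex dump discussed earlier, and the CR bytes are stripped first as recommended.

```shell
# Hypothetical CRLF log with a leading tab on every line.
log=$(mktemp)
printf '\tAccount Name:\tbob\r\n\tSource Network Address:\t99.155.154.135\r\n' > "$log"

# Field 1 is empty (before the leading tab), field 2 is the label, field 3 the value.
ip=$(tr -d '\r' < "$log" | awk -F '\t' '$2 == "Source Network Address:" { print $3 }')
echo "$ip"
rm -f "$log"
```

Without the tr step the printed field would still carry a trailing carriage return.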
  165. <b> right.
  166. <CcxWrk> Okay, so now you can plug it to the find command.
  167. <CcxWrk> The best way is the -exec '{}' construct.
  168. <CcxWrk> http://mywiki.wooledge.org/UsingFind
  169. <b> awk -F '\t' '$2 == "Source Network Address:" {print $3}' || find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f
  170. <b> so not that
  171. <b> ;)
  172. <CcxWrk> You write your find expression and then as an action you specify a command to run. The {} gets replaced by found files and + at the end tells it to substitute as many files at once as it can (the other, slower option being ; for one execution per found file).
  173. <CcxWrk> || very rarely is the answer. ;-)
  174. <b> ......find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec bash {awk -F '\t' '$2 == "Source Network Address:" {print $3}'}
  175. <CcxWrk> Just keep your old awk command, just prefix it with `find ... -exec `, replace filename with '{}' and suffix with '+'
  176. <CcxWrk> Beware that the shell considers curly braces special, so quote them as '{}' to be safe.
  177. <b> .......close?     find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec bash -c '{awk -F '\t' '$2 == "Source Network Address:" {print $3}'}'
  178. <CcxWrk> where did you get bash -c from?
  179. <b> http://mywiki.wooledge.org/UsingFind
  180. <CcxWrk> Also ' inside ' doesn't work.
  181. <b> find ... -exec bash -c
  182. <b> .......like this ?     find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec bash {awk -F '\t' '$2 == "Source Network Address:" {print $3}'}
  183. <b> error......find: missing argument to `-exec'
  184. <CcxWrk> But you don't want to process the files with bash, you want to process them with awk.
  185. <b> ahh
  186. <b> lol
  187. <b> right
  188. <b> .......like this ?     find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec {awk -F '\t' '$2 == "Source Network Address:" {print $3}'}
  189. <b> .......like this ?     find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec '{awk -F '\t' '$2 == "Source Network Address:" {print $3}'}'
  190. <CcxWrk> Closer but still far. Not sure if I have the time rn to explain why though. :-/ Let's try.
  191. <b> It seems to largely be a syntax issue atm
  192. <CcxWrk> It's a frequently misunderstood semantic issue. Very important and sadly infrequently explained. On Unix you execute programs with a command line that is internally a list of strings (as opposed to Windows, which has a specially handled contiguous string).
  193. <b> ok
  194. <b> .....I read more.....is this closer....find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec awk -c 'awk -F '\t' '$2 == "Source Network Address:" {print $3}''{}'
  195. <b> I put the '{}' at the end where the file name should go
  196. <CcxWrk> Which means the command is split into arguments that can contain *any* character (aside from the null byte used internally as string terminator). And you can have many different argument lists that flatten into the same string, just split differently.
  197. <b> ok
  198. <b> understood.
  199. <b> I believe
  200. <b> one of the arguments can be comprised of other commands that all flatten down to the same string
  201. <b> one or all
  202. <CcxWrk> This is the core primitive that is used to run programs, namely the exec*() function family [execve, execlp, execle, many others]. But that's what programmers use, what people input is contiguous text. This is where shell comes in.
  203. <b> so we're trying to use the find command to output an argument in the awk command, as its filename argument.
  204. <b> ......seems I can't get the '{}' in the correct place......find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec 'awk -F '\t' '$2 == "Source Network Address:" {print $3} '{}''
  205. <CcxWrk> Internally C programmers can use the system() function that takes a single string and calls shell with it. But that's not really important in this case. Thing is, shells are complex beasts that are designed to take sequence of characters and turn them into this list structure, often performing lot of handy substitutions and looking up of filenames.
  206. <b> A beast fr sure :) thank you
  207. <b> a beast for sure. Thank you.
  208. <CcxWrk> That's why there are so many different weird kinds of quoting: single, double, heredocs, it gets a little crazy.
  209. <CcxWrk> Now, Unix programs often are made to call other programs.
  210. <b> oh?
  211. <CcxWrk> BRB, need to move, will get back to you.
  212. <CcxWrk> You can look up "unix philosophy" in the meantime. :-)
  213. <b> ......one sec.....find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec 'awk -F '\t' '$2 == "Source Network Address:" {print $3}' '{}'+
  214. <b> I think thats closer.....
  215. <CcxWrk> just missing whitespace now. But the why is rather important if this is not the last script you write. ;-)
  216. <b> True
  217. <b> I will wait
  218. <b> to learn more before simply fixing it
  219. <b> you're right
  220. <b> ......\s    is for a space...correct?
  221. <b> I believe I need to add a whitespace
  222. <b> I tried this....find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec 'awk -F '\t' '$2 == "Source Network Address:" {print $3}' '{printf "%5s", $1}''{}'+
  223. <b> ......still not getting it....find /mnt/logs/logs/Desktops/BranchFailedLogins/ -type f -exec 'awk -F '\t' '$2 == "Source Network Address:" {print $3}' '{printf "%5s", $1}''{}'+
  224. <b> I'm trying to use the command to create the whitespace...'{printf "%5s", $1}'
  225. <CcxWrk> Back. So in the classic "Unix philosophy" lists of design rules you can usually read things like "Make each program do one thing well." and "Write programs to work together."; there are three main ways programs on Unix-like systems are designed to work together.
  226. <CcxWrk> 1) use standard I/O for reading and writing, so they can be chained together with | or redirected to/from files with > >> and <. Examples: echo, sed, grep, tr, awk, too many to count really.
  227. <CcxWrk> 2) Execute given argument list as a program. By convention programs take -e option for this where optional. Examples: xterm, nohup, xargs, and find's -exec action.
  228. <CcxWrk> 3) Execute given string as command line, by running shell (via system() call, which on most systems translates to creating new process and then exec()ing '/bin/sh' '-c' 'yourcommand'). These commands generally use -c option for this. Examples: watch, su, ssh
  229. <b> oh okay
  230. <b> -exec can be any action
  231. <b> then this is more correct......find /mnt/logs/logs/Desktops/CSOFailedLogins/ -type f -exec awk -F '\t' '$2 == "Source Network Address:" {print $3}' {} +
  232. <CcxWrk> Dealing with #3 is very tricky, because you need to be aware of how the shell will interpret your command. This usually means that you will need to add an extra layer of quoting and/or escapes to preserve how the command line is split into argument lists. Because you are usually writing it as a shell script and then one additional shell runs *inside* it.
  233. <b> right.....so inside a script it's nested.
  234. <CcxWrk> This is the source of many pitfalls and of scripts that break on inputs the author didn't consider, so it's important to keep in mind whether your script is re-interpreted by another shell. Luckily, that is not the case here, because find's -exec is #2
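The difference between way #2 (an argument list passed through untouched) and way #3 (a string parsed again by a second shell) can be seen with one argument that contains a space. env is used here merely as a convenient stand-in for "a program that executes its argument list"; the variable names are arbitrary.

```shell
arg='two words'

# Way 2: env receives an argument list and executes it as-is; the space survives.
direct=$(env printf '[%s]' "$arg")

# Way 3: the inner shell re-parses the string and splits it at the space.
reparsed=$(sh -c "printf '[%s]' $arg")

# Way 3 done carefully: pass the value as a positional parameter, not as source text.
careful=$(sh -c 'printf "[%s]" "$1"' sh "$arg")

printf '%s\n' "$direct" "$reparsed" "$careful"
```

The second form is the one that needs the extra layer of quoting mentioned above; the third sidesteps it by never mixing data into the command string.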
  235. <CcxWrk> By the way, one feature worth knowing is the tracing mode of most shells. Enabled by `set -x` or calling sh/mksh/zsh/bash with -x argument it makes it print whichever argument list it executes, reformatted back in shell syntax with quoting/escaping where needed.
  236. <CcxWrk> This is quite invaluable to see how shell actually splits complex commands.
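A quick way to try tracing without touching the current session is to enable it inside a subshell. The exact trace format (the prefix and how arguments are re-quoted) varies between shells, so the sketch below only captures the trace and prints it.

```shell
# Run one command with tracing on; the trace goes to stderr, so swap the streams
# to capture the trace and discard the command's normal output.
trace=$( (set -x; printf '%s\n' 'two words' one) 2>&1 >/dev/null )
printf '%s\n' "$trace"
```

In the trace, 'two words' shows up as part of a single command line for printf, which makes it easier to check where one argument ends and the next begins.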
  237. <CcxWrk> Okay, so you are calling find. Find gets a list of strings. It processes them one by one, first it expects directories and then search conditions and actions.
  238. <CcxWrk> When it encounters an argument consisting of '-exec' it starts reading arguments one by one until it reaches one of three special arguments which need to match exactly.
  239. <CcxWrk> These are '+', ';' and '{}'. Note again that find does not run a shell on the command, so it needs no extra layer of escaping; you only quote these symbols so that your own shell passes them through unchanged. The find authors chose symbols that are meaningful in shell syntax precisely so they wouldn't clash with what normal programs expect for input.
  240. <b> I see.
  241. <CcxWrk> For arguments rather than for input.
  242. <b> Right.  arguments.  gotcha
  243. <CcxWrk> So '{}', that is argument consisting of two characters, opening and closing brace is replaced by found filename. This filename is passed as whole argument. This avoids any error where you would let shell interpret the filename which might contain spaces, tabs, newlines, etc.; any text can be valid filename on unix-like platforms with only / and null byte being special.
  244. <CcxWrk> This is the important part. People often try calling ls or find and processing the textual output. But that's wrong because there is no unique delimiter and given unusual enough filename their script will break.
  245. <CcxWrk> The correct approach here is to let either the shell internally (via its globbing mechanism, such as ./*.txt) or find via its -exec action create the argument lists correctly. Since both individual program arguments and filenames are null-terminated strings they have an exact mapping to each other.
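The pitfall of parsing textual listings can be reproduced with a single file name that contains a space. The file names are invented, and the default word-splitting behaviour of sh/bash is assumed.

```shell
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt"

# Fragile: the output of ls is split on whitespace, so one name becomes two words.
naive=0
for f in $(ls "$dir"); do naive=$((naive + 1)); done

# Robust: the glob expands to exactly one argument per file.
correct=0
for f in "$dir"/*.txt; do correct=$((correct + 1)); done

echo "ls parsing saw $naive names, globbing saw $correct"
rm -rf "$dir"
```

The loop over ls output sees one name too many, while the glob sees one item per file.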
  246. <b> okay...so since we know that both individual program arguments and filenames are null-terminated strings they have exact mapping to each other. That tells us we should use find
  247. <CcxWrk> So the other special arguments to '-exec' are the single characters ';' and '+'. The former is straightforward: For every matched object (whatever passes for "file" in unix, including directories, named pipes, etc.) execute the argument list as given between '-exec' and '+' with '{}' replaced by the actual filename.
  248. <CcxWrk> Err I meant ';' there. The only difference between ';' and '+' is that '+' can replace with more filenames at once.
  249. <CcxWrk> Command lines are of limited length, albeit on modern systems fairly huge. So '+' does try to replace '{}' with as many filenames, each a separate argument, and executes when it can't fit any more or when it finished searching given directories.
  250. <CcxWrk> This might someday come in handy if you get a huge directory with more files than the command line can fit.
  251. <CcxWrk> Say you wanted to clean up a directory with millions of files and you try rm ./* and it fails with "Argument list too long". That's where you'd use find . -type f -exec rm '{}' + to let the find command issue the minimal required number of "rm" commands to get rid of all of the files.
  252. <CcxWrk> (find also has -delete action, but shh, I'm trying to explain the principle) ;-)
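How often the command actually runs with ';' versus '+' can be observed by letting echo stand in for the real command. The three file names are placeholders; with so few files a single batched run is expected.

```shell
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

# One line of output per echo invocation, so counting lines counts the runs.
per_file=$(find "$dir" -type f -exec echo run '{}' ';' | wc -l)
batched=$(find "$dir" -type f -exec echo run '{}' + | wc -l)

echo "runs with ';': $per_file, runs with '+': $batched"
rm -rf "$dir"
```

With ';' echo runs once per file; with '+' all three names are handed to a single echo.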
  253. <CcxWrk> TRS^: Are things a little clearer now?
  254. <CcxWrk> "find" is a bit complicated, being its own mini-language, but very powerful and is the only standard tool generally available for what it does.
  255. <CcxWrk> The important thing here is to learn to think with port^H^H^H^H argument lists and how they are passed between programs. People generally approach it as one opaque thing, confusing shell with programs it calls, and get very confused on what on surface seems like shell being pedantic about some weird syntax rules. But in reality it's many programs interlocked to work together and you need to fit the pieces
  256. <CcxWrk> together right.
  257. <CcxWrk> wwilliam, TRS^: would you mind if I used transcript of these conversations (sans the nicknames of course) when explaining things to others? I think it goes over some important things that have few correct on-line explanations.
  258. <a> I have no problem sure , go ahead. Thanks for asking.
  259. <CcxWrk> Perhaps one more convention to mention are command interpreters. That is programs that interpret programs in some language, usually called scripts. Awk, sed, bc, sh, bash, python and perl are among those.
  260. <CcxWrk> Problem with conventions on Unix is, many essential programs were written before the rules solidified. And even then, convenience often overruled convention, so you'll see quite a few variants here.
  261. <d> woah what's all this, are you giving a shell class CcxWrk ? :D
  262. <CcxWrk> Apparently. :P
  263. <CcxWrk> All of those I mentioned, with exception of bc, can take their script either as a *single* argument - which then will be parsed according to language's internal rules - or as a file. Some interpreters will by default take a filename of a script (such as shell, python, perl) and use -c (remember #3 above) to take it as argument instead. Some other (awk, sed) take it from argument by default and use -f to specify
  264. <CcxWrk> file with the script.
  265. <CcxWrk> This is why you write single quotes around the AWK program: it needs to be passed as a single argument, so AWK can split and interpret it according to its own internal rules. You don't wrap a command run the second way in quotes, because that one is passed directly as an argument list to the operating system function that spawns another program.
  266. <CcxWrk> Actually AWK has been precisely designed to make this easy and it avoids using single quotes anywhere in its syntax, so you can embed it into a shell program by just wrapping it in single quotes. (As long as the program itself doesn't need to use the single quote character for whichever reason; then it gets messier.)
  267. <CcxWrk> For most other languages it gets trickier because when writing it on the shell command line you will have to escape any character that the shell deems special, so that it stays one argument and nothing gets replaced that you didn't want replaced.
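Both ways of handing AWK its program can be compared on one invented input line: as a single quoted argument, and as a file named with -f. Passing the label through the standard -v option is an addition for illustration; it keeps shell data out of the program text.

```shell
# The program stored in a file, for use with -f (quoted heredoc: no shell expansion).
script=$(mktemp)
cat > "$script" <<'EOF'
$2 == label { print $3 }
EOF

# Program given as one single-quoted argument.
inline=$(printf '\tSource Network Address:\t10.1.2.3\n' |
  awk -F '\t' -v label='Source Network Address:' '$2 == label { print $3 }')

# Same program read from the file instead.
from_file=$(printf '\tSource Network Address:\t10.1.2.3\n' |
  awk -F '\t' -v label='Source Network Address:' -f "$script")

printf '%s\n' "$inline" "$from_file"
rm -f "$script"
```

Both invocations print the same address, since AWK parses the same program text either way.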
  268. <CcxWrk> nmz: I'm going over one of the two things (other being filedescriptors) that no guide to shell scripting for sysadmins explains accurately. For some reason only programmer resources go into this. And I've seen many many people bash their heads in frustration because of this omission.
  269. <d> yeah, I still don't understand fds
  270. <CcxWrk> That's for another time. :D I need some dinner now.
  271. <b> Thank you very much for all your education
  272. <b> So the plus signals to loop each result in as an argument in the command and execute and repeat until end of results
  273. <b> +
  274. <b> and ; means to process it as a single command?
  275. <CcxWrk> ; means run given command exactly once per each file found, + means each run can get more than one file (however many fit).
  276. <CcxWrk> Doesn't matter for awk so the + form is faster. But some commands can take only one file, so the distinction exists.
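Putting the pieces together, the whole task might look like the sketch below. It runs against a throwaway tree with invented directory and file names rather than the real /mnt/logs path, strips the trailing CR inside awk instead of with tr (an alternative, not the method given in the chat), and appends everything to a single output file.

```shell
# Hypothetical stand-ins for the per-host directories and their CRLF log files.
root=$(mktemp -d)
out=$(mktemp)
mkdir -p "$root/172.17.153.30" "$root/172.17.153.31"
printf '\tSource Network Address:\t99.155.154.135\r\n' > "$root/172.17.153.30/FailedLogins-a.txt"
printf '\tSource Network Address:\t99.254.184.135\r\n' > "$root/172.17.153.31/FailedLogins-b.txt"

# find builds the argument list; awk receives many file names per run thanks to '+'.
find "$root" -type f -exec awk -F '\t' '
  $2 == "Source Network Address:" { sub(/\r$/, "", $3); print $3 }
' '{}' + >> "$out"

# One address per line, duplicates removed.
ips=$(sort -u "$out")
printf '%s\n' "$ips"
rm -rf "$root" "$out"
```

The output file is deliberately kept outside the searched tree so that find does not pick it up as an input.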
  277. <CcxWrk> For the record https://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html is the ultimate authoritative document on standard `find`, as dry as standard documents are though. Your implementation will likely have some extensions but this is the baseline set of features.
  278. <CcxWrk> In general https://pubs.opengroup.org/onlinepubs/9699919799/utilities/contents.html might be a page you might want to bookmark. A set of requirements for the shell and its commands that any POSIX™-compliant system must provide.