Subject: string with generic delimiters
Date: Thu, 20 Apr 00 21:42:24 CST
From: "Tom Zegub"
Organization: Digital Highway (using Airnews.net!)
Newsgroups: comp.lang.forth

\ Match delimiters for string
: (S) ( c1 -- c2)
   CASE
      [CHAR] < OF [CHAR] > ENDOF
      [CHAR] { OF [CHAR] } ENDOF
      [CHAR] [ OF [CHAR] ] ENDOF
      [CHAR] ( OF [CHAR] ) ENDOF
      DUP        \ use same character for all others
   ENDCASE ;

\ String with generic delimiters
: S ( [char1]ccc[char2] -- addr u)
   BL WORD COUNT      \ get next word
   DUP IF             \ if exist
      MINUS >IN +!    \ bump back input buffer to start of word
      C@ (S)          \ get first character and find matching character
      PARSE           \ parse to matching character
   THEN ;

\ Usage examples

S 'This is example of a string with "generic delimiters"' CR TYPE
S "(note that almost any delimiters can be use.)" CR TYPE
S CR TYPE
S (be used.) CR TYPE
S [This may be more than what most people will want,] CR TYPE
S {but it is as versatile as can be.} CR TYPE
S _There can be multiple spaces between_ CR TYPE
S |the string operator and data.| CR TYPE
S .Also invisible delimiters (value 255) can be. CR TYPE
S +used, but I can't show it as my old news client+ CR TYPE
S \doesn't like 8-bit characters\ CR TYPE

--
Tom Zegub tzegub@dhc.net
WasteLand http://www.dhc.net/~tzegub


Subject: Re: string with generic delimiters
Date: Fri, 21 Apr 2000 22:04:34 +0200
From: "Michael N. Steen"
Organization: get2net Internet Kunde
Newsgroups: comp.lang.forth

Well, that's neat ... Could be confusing, maybe, but could be useful, too, I think.

Best regards
Michael Steen

Tom Zegub wrote in message ...
>\ Match delimiters for string
>: (S) ( c1 -- c2)
>   CASE
>      [CHAR] < OF [CHAR] > ENDOF
>      [CHAR] { OF [CHAR] } ENDOF
>      [CHAR] [ OF [CHAR] ] ENDOF
>      [CHAR] ( OF [CHAR] ) ENDOF
>      DUP        \ use same character for all others
>   ENDCASE ;
>
>\ String with generic delimiters
>: S ( [char1]ccc[char2] -- addr u)
>   BL WORD COUNT      \ get next word
>   DUP IF             \ if exist
>      MINUS >IN +!
\ bump back input buffer to start of word
>      C@ (S)          \ get first character and find matching character
>      PARSE           \ parse to matching character
>THEN ;
>
>\ Usage examples
>
>S 'This is example of a string with "generic delimiters"' CR TYPE
>S "(note that almost any delimiters can be use.)" CR TYPE
>S CR TYPE
>S (be used.) CR TYPE
>S [This may be more than what most people will want,] CR TYPE
>S {but it is as versatile as can be.} CR TYPE
>S _There can be multiple spaces between_ CR TYPE
>S |the string operator and data.| CR TYPE
>S .Also invisible delimiters (value 255) can be. CR TYPE
>S +used, but I can't show it as my old news client+ CR TYPE
>S \doesn't like 8-bit characters\ CR TYPE
>
>--
>Tom Zegub tzegub@dhc.net
>WasteLand http://www.dhc.net/~tzegub


Subject: Re: string with generic delimiters
Date: Sat, 22 Apr 2000 22:48:37 +0300
From: "m_l_g3@my-deja.com"
Organization: 3 ?A-ERROR ( ADDR OUT OF DIAPAZON)
Newsgroups: comp.lang.forth

This is what I used in RuFIG Resource Index Script
(go to www.forth.org.ru and click "Resources" to see what it is).

The syntax is like:
   $ 'ABCD' ( addr-of-counted-string )

If I needed both ( counted-addr ) and ( text-addr text-len ),
I would use the names S$ and C$ .
Now I use S$ and C$ as prefixes indicating the operand type(s)
that the word assumes.

For example,

: ANS-Forth
   $ 'ANSI' =Author
   $ 'http://www.complang.tuwien.ac.at/forth/dpans-html/' =Location
   $ `ANS Forth Standard (last draft + RFI's, in HTML)` =Title
   $ 'root page (в Австрии)' =Comment
   $En =Language $Forth =ProgLan $URL =Type
   ReferenceRemote
   item

The code is:

\ ------------------------------------------------------------
see S$,
: S$, DUP C, HERE SWAP DUP ALLOT MOVE EXIT ok
\ ------------------------------------------------------------
\ fetch a char from the input stream, return -1 on end-of-line
: CH> ( "c" -- c )
   SOURCE >IN @ TUCK > >R + C@ R> 0= OR 1 >IN +!
;
\ fetch a non-space from the input stream, return -1 on end-of-line
: CH>> ( " c" -- c )
   0 BEGIN CH> NIP DUP BL U> UNTIL
;
\ ------------------------------------------------------------
: $ ( Compilation: "ccc" -- ) ( Run-time: -- c-addr )
   ?COMP CH> PARSE COMPILE (C") S$, /ALIGN
; IMMEDIATE
: .$ ( Compilation: "ccc" -- ) ( Run-time: -- )
   ?COMP CH> PARSE COMPILE (.") S$, /ALIGN
; IMMEDIATE
\ ------------------------------------------------------------
\ ------ other definitions of this sort: ---------------------
: string ( "name ccc" -- ) ( name: -- c-addr )
   CREATE CH>> PARSE S$, ALIGN
;
: $. ( c-addr -- )
   DUP 0<> IF COUNT TYPE ELSE DROP ." " THEN
;
: S$+ ( addr-to len-to addr-from len-from -- addr-to len-total )
   DUP >R         ( R: lf )
   2OVER + SWAP   ( at lt af at+lt lf )
   CMOVE          ( at lt )
   R> +
;
: C$+ ( addr-to len-to counted-from -- ) COUNT S$+ ;
: -COUNT ( addr+1 len -- addr ) SWAP 1- TUCK C! ;


Subject: Re: string with generic delimiters
Date: Sun, 23 Apr 2000 23:42:06 GMT
From: jethomas@ix.netcom.com (Jonah Thomas)
Organization: MindSpring Enterprises
Newsgroups: comp.lang.forth
Followup-To: comp.lang.forth

wrote:
>\ Usage examples
>S 'This is example of a string with "generic delimiters"' CR TYPE
>S "(note that almost any delimiters can be use.)" CR TYPE
>S CR TYPE
>S (be used.) CR TYPE
>S [This may be more than what most people will want,] CR TYPE
>S {but it is as versatile as can be.} CR TYPE
>S _There can be multiple spaces between_ CR TYPE
>S |the string operator and data.| CR TYPE
>S .Also invisible delimiters (value 255) can be. CR TYPE
>S +used, but I can't show it as my old news client+ CR TYPE
>S \doesn't like 8-bit characters\ CR TYPE

Neat! I like it. Easy to do, once you get the idea it should be done.
One of those times I find myself thinking, 'why didn't I think of that?'.


Subject: Re: string with generic delimiters
Date: Mon, 24 Apr 00 02:40:50 CST
From: "Tom Zegub"
Organization: Digital Highway (using Airnews.net!)
Newsgroups: comp.lang.forth

MLG wrote:
>This is what I used in RuFIG Resource Index Script
>(go to www.forth.org.ru and click "Resources" to see what it is).
>
>The syntax is like:
>   $ 'ABCD' ( addr-of-counted-string )
>
>If I needed both ( counted-addr ) and ( text-addr text-len ),
>I would use the names S$ and C$ .
>Now I use S$ and C$ as prefixes indicating the operand type(s)
>that the word assumes.
>

Ok, then I will change S to S$:

S$ |HAL said, "Hi Dave, how's life?|

>\ ------------------------------------------------------------
>\ fetch a char from the input stream, return -1 on end-of-line
>: CH> ( "c" -- c )
>   SOURCE >IN @ TUCK > >R + C@ R> 0= OR 1 >IN +!
>;

I simplified my first pass by requiring string data to start 1 space
after the string operator (nothing significant is lost). The result is
close to what you have above:

   SOURCE >IN @ MIN +   \ address of 1st character
   C@ (S)               \ determine second delimiter
   1 >IN +!             \ bump past first delimiter
   PARSE                \ parse to second delimiter

At first I had a TUCK and an error check, but since I wasn't going to do
error handling here I just clamped the input offset to the source limit.
PARSE is already set up to handle an empty parse area.

(S) is used to match delimiters, in particular open and close delimiters
like "<" with ">" . It can be left out if simplification is strongly
desired. For me its cost is negligible and I like having the open/close
delimiter pairs. I will probably keep it.

Once the string is parsed it is moved to its place of residence. If
compiling, SLITERAL will make it part of the word being defined. If
interpreting, it is moved to a circular string buffer. I have a 256 byte
buffer in high memory, %S , pre-allocated for that purpose. Given the
size of the string allocation needed, S+ provides the offset, wrapping as
necessary.
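
[Editorial sketch] The wrap rule just described can be tried in isolation. The block below is only an illustration under stated assumptions: the names RING-SIZE, RING-NEXT and RING-OFFSET are hypothetical and do not appear in the post, each string is assumed to take one count byte plus its text, and an explicit IF replaces the MIN/AND arithmetic of the posted word.

```forth
\ Hypothetical names; a sketch of the wrap rule, not the posted code.
256 CONSTANT RING-SIZE          \ assumed buffer size, as in the post
VARIABLE RING-NEXT              \ next free offset in the buffer
0 RING-NEXT !                   \ VARIABLE contents are undefined until stored

\ Return the offset at which a string of u characters should be stored.
\ If offset+u would reach the end of the buffer, restart at offset 0.
\ Then advance RING-NEXT past the count byte and the text.
: RING-OFFSET ( u -- offset )
   RING-NEXT @                  ( u offset )
   2DUP + RING-SIZE < 0= IF     \ would the string run off the end?
      DROP 0                    \ yes: wrap to the start
   THEN                         ( u offset' )
   TUCK + 1+ RING-NEXT ! ;      \ next free = offset' + u + 1
```

With RING-NEXT at 250, a request for 10 characters does not fit, so the returned offset is 0 and the next free offset becomes 11. Strings stored earlier at the start of the buffer are overwritten at that point, which is the price of the circular scheme.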
The code:

\ 256 byte string buffer
\ CREATE %S 256 ALLOT \ uncomment if you need it

\ Next available offset in string buffer
VARIABLE &S

\ Get next offset in string buffer wrapping
\ when buffer limit is exceeded
\ u1 = length of string needed
\ u2 = available offset in string buffer
: S+ ( u1 -- u2)
   &S @          \ get next offset in string buffer
   2DUP +        \ add amount to allocate, u1
   256 MIN       \ if total exceeds buffer length clamp it there
   255 AND       \ see if total was within buffer range
   MIN           \ if the limit was exceeded it will have wrapped
   TUCK          \ tuck offset to be returned, u2
   + 1+ &S ! ;   \ store next available offset

\ Determine string data matching delimiter
: (S) ( c1 -- c2)
   CASE
      [CHAR] < OF [CHAR] > ENDOF
      [CHAR] { OF [CHAR] } ENDOF
      [CHAR] [ OF [CHAR] ] ENDOF
      [CHAR] ( OF [CHAR] ) ENDOF
      DUP        \ c2=c1 for all other cases
   ENDCASE ;

\ String operator with delimiters determined from the
\ first character of the string data.
\ The first character is located 1 space after the string operator.
: S$ ( ccc -- addr u)
   SOURCE >IN @ MIN +      \ address of 1st character
   C@ (S)                  \ determine second delimiter
   1 >IN +!                \ bump past first delimiter
   PARSE                   \ parse to second delimiter
   STATE @ IF ( compiling)
      POSTPONE SLITERAL    \ include parsed string in definition
   ELSE ( interpreting)
      DUP S+               \ get next offset in string buffer
      %S +                 \ convert to address in string buffer
      2DUP C!              \ store count in string buffer
      >R                   \ rack string buffer address
      R@ 1+ SWAP CMOVE     \ copy parsed string to string buffer
      R>                   \ un-rack string buffer address
      COUNT                \ (addr u) of string in string buffer
   THEN ; IMMEDIATE

--
Tom Zegub tzegub@dhc.net
WasteLand http://www.dhc.net/~tzegub


Subject: Re: string with generic delimiters
Date: Tue, 25 Apr 2000 14:26:36 +0200
From: "Jos v.d.Ven"
Organization: Planet Internet
Newsgroups: comp.lang.forth

"Tom Zegub" wrote in news:0C49FBF7D3D33677.450948907FA1B616.41DCC6FBFCC6D519@lp.airnews.net...
> \ String operator with delimiters determined from the
> \ first character of the string data.
> \ The first character is located 1 space after the string operator.
> : S$ ( ccc -- addr u)
>    SOURCE >IN @ MIN +      \ address of 1st character
>    C@ (S)                  \ determine second delimiter
>    1 >IN +!                \ bump past first delimiter
>    PARSE                   \ parse to second delimiter
>    STATE @ IF ( compiling)
>       POSTPONE SLITERAL    \ include parsed string in definition
>    ELSE ( interpreting)
>       DUP S+               \ get next offset in string buffer
>       %S +                 \ convert to address in string buffer
>       2DUP C!              \ store count in string buffer
>       >R                   \ rack string buffer address
>       R@ 1+ SWAP CMOVE     \ copy parsed string to string buffer
>       R>                   \ un-rack string buffer address
>       COUNT                \ (addr u) of string in string buffer
>    THEN ; IMMEDIATE
>

I changed S$ as follows:

: S$ ( ccc -- addr u)
   SOURCE >IN @ MIN +      \ address of 1st character
   C@ (S)                  \ determine second delimiter
   1 >IN +!                \ bump past first delimiter
   PARSE                   \ parse to second delimiter
   STATE @ IF ( compiling)
      POSTPONE SLITERAL    \ include parsed string in definition
   THEN
   ; IMMEDIATE

S$ |HAL said, "Hi Dave, how's life outside a definition?| cr .s type
: Test2 S$ |HAL said, "Hi Dave, how's life in a definition ?| cr .s type ;

test2

\s
\ In Win32Forth the console shows:
[2] 11008 52 HAL said, "Hi Dave, how's life outside a definition?
[2] 245449 48 HAL said, "Hi Dave, how's life in a definition ? ok

\ NOTE : word: S$ isn't unique
\ Perhaps we need another name for S$

Jos


Subject: Re: string with generic delimiters
Date: Tue, 25 Apr 2000 19:48:12 +0300
From: "m_l_g3@my-deja.com"
Organization: 3 ?A-ERROR ( ADDR OUT OF DIAPAZON)
Newsgroups: comp.lang.forth

"Jos v.d.Ven" wrote:
> I changed S$ as follows:
>
> : S$ ( ccc -- addr u)
>    SOURCE >IN @ MIN +      \ address of 1st character
>    C@ (S)                  \ determine second delimiter
>    1 >IN +!
\ bump past first delimiter
>    PARSE                   \ parse to second delimiter
>    STATE @ IF ( compiling)
>       POSTPONE SLITERAL    \ include parsed string in definition
>    THEN

*** ELSE leave (addr,len) as is
*** (addr,len) is to be consumed before REFILLing the text input buffer

>    ; IMMEDIATE
>
> S$ |HAL said, "Hi Dave, how's life outside a definition?| cr .s type
> : Test2 S$ |HAL said, "Hi Dave, how's life in a definition ?| cr .s type ;
>
> test2
>
> \s
> \ In Win32Forth the console shows:
> [2] 11008 52 HAL said, "Hi Dave, how's life outside a definition?
> [2] 245449 48 HAL said, "Hi Dave, how's life in a definition ? ok
>
> \ NOTE : word: S$ isn't unique
> \ Perhaps we need another name for S$
>
> Jos

No, the existing S$ is the same word, except that it does not understand
that '<' matches '>' etc.

: ,$ ( -< #text#>- )
   \ get next char from input source
   >in @ bl word swap >in ! 1+ c@ ( char )
   \ now get a word delimited by that char
   word
   \ and add it to the dictionary
   count here place here c@ 1+ allot 0 c, align ;

: .$ ( -< #text#>- ) POSTPONE (.") ,$ ; immediate
: s$ ( -< #text#>- ) ( -- a1 n1 ) POSTPONE (s") ,$ ; immediate

If I really cared about an extra word, I'd replace the original
definition of S$ with this new one. I don't think anyone ever used the
characters < { [ ( after the original S$ , so I would expect the source
to load ok without any changes. To be absolutely sure, one could perform
a directory scan...


Subject: Re: string with generic delimiters
Date: Thu, 27 Apr 00 00:59:45 CST
From: "Tom Zegub"
Organization: Digital Highway (using Airnews.net!)
Newsgroups: comp.lang.forth

On Tue, 25 Apr 2000 14:26:36 +0200, Jos v.d.Ven wrote:
>
>I changed S$ as follows:
>
>: S$ ( ccc -- addr u)
>   SOURCE >IN @ MIN +      \ address of 1st character
>   C@ (S)                  \ determine second delimiter
>   1 >IN +!
\ bump past first delimiter
>   PARSE                   \ parse to second delimiter
>   STATE @ IF ( compiling)
>      POSTPONE SLITERAL    \ include parsed string in definition
>   THEN
>   ; IMMEDIATE
>

Perhaps I can also get by without persistent strings. I will give it a try.

--
Tom Zegub tzegub@dhc.net
WasteLand http://www.dhc.net/~tzegub
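

[Editorial sketch] MLG's remark that (addr,len) must be consumed before the input buffer is refilled points at the cost of dropping the string buffer: an interpreted string then lives only in the current input line. One possible way to keep such a string is to copy it into dictionary space. The block below is a minimal sketch of that idea, not something posted in the thread: KEEP$ and DEMO are hypothetical names, and the demo uses the standard S" inside a definition so that the block stands alone without the S$ definitions above.

```forth
\ KEEP$ is a hypothetical helper, not a word from this thread.
\ Copy a transient string into dictionary space so that it survives
\ after the input buffer it was parsed from has been refilled.
: KEEP$ ( addr u -- addr' u )
   HERE >R              \ remember where the copy will start
   DUP CHARS ALLOT      \ reserve room for u characters
   R@ SWAP              ( addr addr' u )
   DUP >R CHARS MOVE    \ copy the text, saving u on the return stack
   R> R> SWAP ;         ( addr' u )

\ Demo: the copied string can be typed from its new address.
: DEMO ( -- ) S" HAL said hi" KEEP$ CR TYPE ;
DEMO
```

Each call allots fresh dictionary space and never frees it, so this suits a handful of strings kept for the life of a program; the circular buffer from the earlier post is the better fit for many short-lived strings.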