Thread overview
Incomplete words read from file
Nov 17, 2021
pascal111
Nov 18, 2021
jfondren
Nov 18, 2021
H. S. Teoh
Nov 18, 2021
Ali Çehreli
Nov 18, 2021
pascal111
Nov 18, 2021
jfondren
Nov 18, 2021
pascal111
November 17, 2021
I made a small program that shows the contents of text files. It succeeds in displaying non-English (non-ASCII) text, but in many lines some words are incomplete and the rest of each word appears on the next line. How can I fix it?

"https://i.postimg.cc/rpP7dQYH/Screenshot-from-2021-11-18-01-40-43.png"

```d
// D programming language

import std.stdio;
import std.string;

int main()
{
    string s;
    char[] f;

    try {
        write("Enter file name and path: ");
        readln(f);
        f = strip(f);
    }
    catch (Exception err) {
        stderr.writefln!"Warning! %s"(err.msg);
    }

    File file = File(f, "r");

    while (!file.eof()) {
        s = chomp(file.readln());
        writeln(s);
    }

    file.close();

    return 0;
}
```
November 18, 2021

On Wednesday, 17 November 2021 at 23:46:15 UTC, pascal111 wrote:
> I made a small program that shows the contents of text files. It succeeds in displaying non-English (non-ASCII) text, but in many lines some words are incomplete and the rest of each word appears on the next line. How can I fix it?

There's nothing in your program that breaks lines differently from the input. If you have a Unicode-aware terminal, it should work as it is. If 'cat Jekyll1' doesn't produce the same output as this program... then there must be some right-to-left work that needs to happen that I'm not aware of.

If what you're wanting to do is to reshape text so that it prints with proper word-breaks across lines according to the current size of the terminal, then you've got to do this work yourself. On Unix a simple shortcut might be to print through fmt(1) instead:

void main() {
    import std.process : pipeShell, Redirect, wait;

    auto fmt = pipeShell("fmt", Redirect.stdin);
    scope (exit) {
        fmt.stdin.close;
        wait(fmt.pid);
    }

    char[15] longword = 'x';
    foreach (i; 1 .. 10) {
        fmt.stdin.writeln(longword);
    }
}

which outputs:

xxxxxxxxxxxxxxx xxxxxxxxxxxxxxx xxxxxxxxxxxxxxx xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx xxxxxxxxxxxxxxx xxxxxxxxxxxxxxx xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx

or... oh, there's std.string.wrap. With the same output:

void main() {
    import std.string : wrap;
    import std.stdio : write;

    enum longtext = {
        char[15] longword = 'x';
        string result;
        foreach (i; 1 .. 10)
            result ~= ' ' ~ longword;
        return result;
    }();

    write(longtext.wrap(72));
}

These tools might not do what you want with that language, though.

November 17, 2021
On 11/17/21 3:46 PM, pascal111 wrote:
> I made a small program that shows the contents of text files. It
> succeeds in displaying non-English (non-ASCII) text, but in many lines
> some words are incomplete and the rest of each word appears on the next
> line. How can I fix it?

D assumes UTF-8 encoding by default. If the file is not UTF-8, std.encoding.transcode may be useful:

  https://dlang.org/library/std/encoding/transcode.html

Of course, the encoding of the file must be known. However, I think your file is already UTF-8.
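
In case the file turns out not to be UTF-8, here is a minimal sketch of how transcode could be used. It assumes the input is ISO-8859-1 (Latin-1), which is only an example encoding and not something known about the file in this thread; the helper name latin1ToUtf8 is made up for illustration.

```d
import std.encoding : Latin1String, transcode;
import std.stdio : writeln;

/// Hypothetical helper: converts bytes ASSUMED to be Latin-1 into UTF-8.
string latin1ToUtf8(const(ubyte)[] raw) {
    // Reinterpret the bytes as Latin-1 text; idup gives an immutable copy.
    auto latin = cast(Latin1String) raw.idup;
    string utf8;
    transcode(latin, utf8);
    return utf8;
}

void main() {
    // 0xE9 is 'é' in Latin-1; as a lone byte it is not valid UTF-8.
    const(ubyte)[] raw = [0x63, 0x61, 0x66, 0xE9];
    writeln(latin1ToUtf8(raw));
}
```

For a real file, the bytes could come from std.file.read, and the source type would have to match the file's actual encoding.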

> "https://i.postimg.cc/rpP7dQYH/Screenshot-from-2021-11-18-01-40-43.png"

The characters indeed look correct. I wonder whether the lines don't fit your terminal's width and the terminal is wrapping them?

If you want to wrap the lines programmatically, there is std.string.wrap:

  https://dlang.org/phobos/std_string.html#.wrap
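
As a rough sketch of what that could look like, the helper below (wrapToLines is a made-up name) wraps a piece of text at word boundaries and splits the result into lines. The width of 8 is arbitrary and only keeps the example small.

```d
import std.stdio : writeln;
import std.string : splitLines, wrap;

/// Hypothetical helper: wraps `text` at word boundaries and
/// returns the resulting lines without their terminators.
string[] wrapToLines(string text, size_t columns) {
    return wrap(text, columns).splitLines;
}

void main() {
    foreach (line; wrapToLines("aaa bbb ccc", 8))
        writeln(line);
}
```

Whether the column count matches what the terminal actually displays for non-ASCII text is a separate question that this sketch does not address.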

Because we've talked about parts of your program earlier, I'll take the liberty of commenting on it. :) Then I will show an alternative version below.

> '''d
>
> // D programming language
>
> import std.stdio;
> import std.string;
>
> int main()
> {
>
> string s;

It is a general guideline that variables should be defined as close to their first use as possible. This allows for more readable, maintainable, and refactorable code.

> char[] f;

I was able to make this a string in the program below by calling the no-parameter version of readln().

>
>
> try{
> write("Enter file name and path: ");
> readln(f);
> f=strip(f);}

Because errors can occur in other parts of the program as well, you can wrap the whole code in a try block.

>
> catch(Exception err){
> stderr.writefln!"Warning! %s"(err.msg);}

It is better to either return with a non-zero error code here or do something about the error. For example:

  writeln("Using the default file.");
  f = "my_default_file".dup;

But I think it is better to return 1 there.

>
>
> File file = File(f, "r");
>
> while (!file.eof()) {

There are byLine and byLineCopy, which iterate over a file line by line; you can use either of them here as well.

>        s = chomp(file.readln());
>        writeln(s);
>     }
>
> file.close();

Although harmless, you don't need to call File.close because it is already called by the destructor of File.

>
> return 0;
>
> }
>
>
> '''

Here is an alternative:

import std.stdio;
import std.string;

int main() {
  try {
    printFileLines();

  } catch(Exception err){
    stderr.writefln!"Warning! %s"(err.msg);
    return 1;
  }

  return 0;
}

void printFileLines() {
  write("Enter file name and path: ");
  string f = strip(readln());

  File file = File(f, "r");

  foreach (line; file.byLine) {
    const s = chomp(line);
    writeln(s);
  }
}
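
On the choice between byLine and byLineCopy: byLine reuses its internal buffer, so a line is only valid until the next iteration, which is fine for printing right away as above. If the lines need to be kept, byLineCopy is the safer option. A small sketch, where readAllLines is a made-up helper name:

```d
import std.array : array;
import std.stdio : File, writeln;

/// Hypothetical helper: reads every line of `path` into memory,
/// without line terminators.
string[] readAllLines(string path) {
    // byLineCopy allocates a fresh string per line, so the lines can be stored.
    return File(path, "r").byLineCopy.array;
}

void main(string[] args) {
    if (args.length < 2)
        return;
    foreach (line; readAllLines(args[1]))
        writeln(line);
}
```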

Ali

November 17, 2021
On Thu, Nov 18, 2021 at 12:39:12AM +0000, jfondren via Digitalmars-d-learn wrote: [...]
> If what you're wanting to do is to *reshape* text so that it prints with proper word-breaks across lines according to the current size of the terminal, then you've got to do this work yourself.
[...]

Just to chip in: line-breaking in Unicode is, in general, non-trivial, because it changes depending on language, left-to-right / right-to-left settings, font properties, and display environment.  If this is what you want to do, the `linebreak` dub package may be a good starting point (it implements the Unicode line-breaking algorithm in Annex 14):

	https://code.dlang.org/packages/linebreak/1.1.2

Note that this algorithm only gives you linebreak opportunities; you still have to figure out yourself where among these opportunities to actually insert a linebreak.  For this you will need to measure how long each text segment is.  In general, this also depends on your font, font size, and font properties.  If you're outputting to the terminal, this is somewhat simpler (most graphemes are 1 column wide) but you still have to take into account double-width and zero-width characters (and also how your terminal actually displays such characters -- not all terminals will display double-width characters as double width, though most will).
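
To illustrate the measuring step, here is a deliberately rough sketch that counts one column per grapheme cluster using std.uni.byGrapheme. It folds combining marks into their base character, but it does NOT account for double-width characters or terminal quirks, so treat it as a starting approximation only. The name approxColumns is made up for this example.

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.uni : byGrapheme;

/// Hypothetical helper: rough width estimate, one column per grapheme.
/// Double-width characters are NOT handled here.
size_t approxColumns(string text) {
    return text.byGrapheme.walkLength;
}

void main() {
    writeln(approxColumns("abc"));
    // 'e' followed by a combining acute accent forms a single grapheme.
    writeln(approxColumns("e\u0301"));
}
```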


T

-- 
MAS = Mana Ada Sistem?
November 18, 2021
On Thursday, 18 November 2021 at 00:42:49 UTC, Ali Çehreli wrote:
> [...]


I changed the code like this, and it now works without breaking words, but this time it shows short lines, as if ordinary prose were a poem. Can we fix this, or will the terminal force its own line wrapping on us?

"https://i.postimg.cc/FHQFPgm8/Screenshot-from-2021-11-18-03-16-41.png"


import std.stdio;
import std.string;
import std.process : pipeShell, Redirect, wait;
import std.format;

int main() {
  try {
    printFileLines();

  } catch(Exception err){
    stderr.writefln!"Warning! %s"(err.msg);
    return 1;
  }

  return 0;
}

void printFileLines() {
  auto fmt = pipeShell("fmt", Redirect.stdin);
  scope (exit) {
    fmt.stdin.close;
    wait(fmt.pid);
  }

  write("Enter file name and path: ");
  string f = strip(readln());

  File file = File(f, "r");

  foreach (line; file.byLine) {
    const s = chomp(line);
    fmt.stdin.writeln(s);
  }
}

November 18, 2021

On Thursday, 18 November 2021 at 01:21:00 UTC, pascal111 wrote:
> I changed the code like this, and it now works without breaking words, but this time it shows short lines, as if ordinary prose were a poem. Can we fix this, or will the terminal force its own line wrapping on us?
>
> "https://i.postimg.cc/FHQFPgm8/Screenshot-from-2021-11-18-03-16-41.png"
>
> ...
>
> auto fmt = pipeShell("fmt", Redirect.stdin);

try "fmt --width=120"

'man fmt' will tell you about its other arguments.
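
A minimal sketch of how the width could be made a parameter instead of being hard-coded in the command string. The helper names fmtCommand and printWrapped are made up for this example, and 120 is just the value suggested above; only the --width option of fmt is assumed.

```d
import std.format : format;
import std.process : pipeShell, Redirect, wait;
import std.stdio : File;

/// Hypothetical helper: builds the shell command for a given width.
string fmtCommand(int width) {
    return format("fmt --width=%d", width);
}

/// Hypothetical helper: pipes every line of `path` through fmt.
void printWrapped(string path, int width) {
    auto fmt = pipeShell(fmtCommand(width), Redirect.stdin);
    scope (exit) {
        // Closing stdin tells fmt that the input is complete.
        fmt.stdin.close();
        wait(fmt.pid);
    }
    foreach (line; File(path, "r").byLine)
        fmt.stdin.writeln(line);
}

void main(string[] args) {
    if (args.length > 1)
        printWrapped(args[1], 120);
}
```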

November 18, 2021

On Thursday, 18 November 2021 at 01:28:47 UTC, jfondren wrote:

> try "fmt --width=120"
>
> 'man fmt' will tell you about its other arguments.

It works!

"https://i.postimg.cc/dtDnWpwN/Screenshot-from-2021-11-18-03-59-01.png"