woods@weird.com (Greg A. Woods) wrote:
> Subject: Re: [BUG] file Truncate is it really busted?
>
> woods@weird.com (Greg A. Woods) replied to my explanation of the
> 'f' prefixes in C file handling functions.
>
> Well there's also the fact that the two uses of the 'f' prefix are
            vvvvvvvvvvvvvvvvvvvvvvv
> really in fundamentally different levels of the API -- or depending on
            ^^^^^^^^^^^^^^^^^^^^^^^
> your point of view maybe even completely different APIs.
>
> Fundamentally different?

Note I said "completely different", not "fundamentaly different" ;-)

Nope.  You wrote _both_.
> Some of them are in the kernel, and some of them
> are in the C stdio library,

That is to SUPPORT your claim that they are at 'different levels of the API'.
Well on most unix and unix-like systems both the kernel system call functions and the stdio functions are in libc.
Half true. Glue code for the system calls can be found in libc. There is an observable difference. If you use dis(1), you can find out what the actual code of fopen() is, and you can step through it in the debugger. You can't do that with open().
But again, that's to SUPPORT your claim.
If the system call interfaces had ever been supplied separately it might have been in a library called libsys. Libc has become a mess of a whole slew of other very much unrelated APIs.
> but from the point of view of your average UNIX
> programmer this is a distinction without a difference.

Perhaps that's true, though I don't really believe anyone with even a small amount of C programming experience could confuse the set of kernel interface functions that deal with file descriptors and the set of "higher" stdio functions that deal with the buffered file I/O using pointers to structures in which the library maintains internal state.

It's obvious you don't teach students.  Believe it.  In many ways the stdio functions are NOT "higher"; a lot of the interesting stuff like locking can't really be done through them.
Let's face it, it's pretty hard to argue that "dealing with files" amounts to "completely different APIs", especially when the names are by design very very similar.
This thread started because someone *WAS* confused by the "f" prefix, and expected a convention from one part of the UNIX I/O library to apply to a function that actually came from another part of the UNIX I/O library.
> open() and fopen()
> are *both* in POSIX.

Yeah, sure, but that's really got nothing whatsoever to do with their differences.....  POSIX defines the API for many unrelated sets of functions.

Yes it has.  They are both "open a file" functions and they are both in all the standards that include "open".  POSIX defines one API that covers many topics, and the fact that there is not a consistent naming convention in that single API is precisely what this thread is about.
> A UNIX look-alike could quite legitimately place
> fopen() in the kernel and open() in a library, just like it is on my Mac
> at home.  (Yep, Think C layered the UNIX functions on top of the stdio
> functions, and they were layered on top of the MacOS ones.)

Hmmm... that's not really an important distinction in a single user system where the hardware doesn't properly isolate the data and code of the system from the data and code of user programs.
Nothing in POSIX requires the hardware to isolate code and data. Nothing at all. I have used a "UNIX" implementation where everything was mapped into a single address space. This had snags, like you couldn't save pointers into a file and load them back later, because every time a program was run it might start at a different address.
The implementation of a difference between lower-level "kernel" file access functions and a layer of buffered I/O functions in a single address space, especially in C or something even lower-level, kind of makes all these distinctions artificial.  I think that was my point.

"all these distinctions artificial" doesn't sound much like "fundamentally different ... completely different".
> Yes, I know.  That's pretty much what I said.  That's the PROBLEM.

I don't see how there can be a problem so long as you keep a separate view in your mind of the kernel API and the stdio API, and never attempt to mix the two without first gaining a deep understanding of the potential interactions of the particular mix you contemplate.

The whole point of this thread is that the call to ftruncate() in a version of Squeak is wrong because the author saw the "f" prefix and thought it meant "takes a FILE* instead of an int" when in fact it meant "takes an int instead of a char*".
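To make the clash concrete, here is a minimal sketch, assuming a POSIX system. The helper names are hypothetical and are not taken from the Squeak source. In the descriptor layer the plain name takes a path and the 'f' name takes an int, so ftruncate() follows the stat()/fstat() pattern, and a FILE* has to go through fileno() first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain name: takes a path (char *).  Hypothetical helper. */
off_t size_by_path(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? st.st_size : (off_t)-1;
}

/* 'f' name: takes a descriptor (int), NOT a FILE *.  Hypothetical helper. */
off_t size_by_fd(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 ? st.st_size : (off_t)-1;
}

/* ftruncate() is in the same family as fstat(): it wants an int.
 * Passing a FILE * directly is a type error, so bridge with fileno(). */
int shrink_stream_file(FILE *fp, off_t length)
{
    assert(fp != NULL);
    if (fflush(fp) != 0)            /* push buffered output down first */
        return -1;
    return ftruncate(fileno(fp), length);
}
```

Note that this version takes an explicit length; it says nothing yet about where the stream itself thinks it is.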
In POSIX as it stands now, there _isn't_ any "kernel API" and there _isn't_ any "stdio API", there is _one_ API with many functions, including two different ways to access files, and for nontrivial applications you have to use both of those ways, because each of them can do something the other can't.
It seems as though having a "direct" and a "buffered" layer is something operating system designers think we can't do without: VMS has both RMS and QIO, and MVS has something similar. On the other hand, the B6700 MCP managed without such a distinction, and arguably CMS too.
There is an important design lesson here, which is that naming matters.
Since we're discussing this in a forum and in a context that relates to interfaces in an object-oriented system how about we simply declare that the lower level file-descriptor functions are in one class, and that the stdio functions are in another class (perhaps a richer class derived from the former one).
How about we don't? C doesn't have classes; and the C++ interface to POSIX doesn't do it that way.
Stdio provides buffering and formatting; descriptor I/O doesn't. Descriptor I/O provides memory mapping, locking, truncation, and synchronisation; stdio doesn't. It is very hard to argue that one is richer than the other. This is part of the problem (the one that actually happened, remember?) Many programs in a UNIX environment require *both* layers, sometimes with the same file.
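A small sketch of what "both layers, same file" looks like in practice, assuming a POSIX system; write_record_durably() is a hypothetical name used only for illustration. Formatting is only available through stdio, and forcing data to stable storage is only available through the descriptor, so one function ends up calling both.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: format a record with stdio, then force it to
 * stable storage with a descriptor-level call on the same file. */
int write_record_durably(FILE *fp, const char *text)
{
    assert(fp != NULL && text != NULL);
    if (fprintf(fp, "%s\n", text) < 0)   /* formatting: stdio only */
        return -1;
    if (fflush(fp) != 0)                 /* hand stdio's buffer to the kernel */
        return -1;
    return fsync(fileno(fp));            /* synchronisation: descriptor only */
}
```

The fflush() has to come before the fsync(); otherwise the record may still be sitting in stdio's buffer when the descriptor is synchronised.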
> The getchar(), getc(), putchar, putc(), printf(), and scanf()
> functions, amongst others, operate on the internal elements of a FILE
> structure, but don't have a FILE * argument.

Ah, now you're taking the analogy backwards and far too far.

No, I am (a) exhibiting a logical error in your argument, and (b) pointing out that it is this very thinking "'f' prefix means FILE* argument" that actually led to a real programming error in a version of Squeak.

You're perhaps confusing naming conventions between two unrelated APIs.  There's no direct connection between the 'f' prefix and the fact that the function is a stdio function.
This is very insulting. My first posting in this thread made it quite clear that I understood the distinction. The posting to which Woods was replying made it even clearer that I understood stdio quite thoroughly.
I am not confusing naming conventions between two unrelated APIs. SOMEONE *ELSE* was confused by the fact that two STRONGLY RELATED parts of the unified POSIX API used the same prefix for opposite purposes.
> I'm sorry, but this is precisely why it WOULD make sense to have fftruncate(),
> because ftruncate(fileno(fp)) just plain doesn't work.  I *think* the
> following code will work, but of course it isn't portable to systems that
> don't have f{un,}lockfile():

Ah, I see -- you have some rather extravagant expectations!  :-)

I don't regard "it should work correctly" as extravagant.
'ftruncate(fileno(fp))' does in fact work, perfectly even -- it just doesn't leave the FILE* pointed to by 'fp' in any kind of useful state, and expecting it to do so is perhaps unrealistic since you're mixing too many operations between different levels.

WRONG.  That's not the point at all.  The point is that a call to ftruncate(fileno(fp)) TRUNCATES AT THE WRONG PLACE!  If the last operation was an output, the file may be left too short.  If the last operation was an input, the file may be left too long.
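For illustration only, here is one way a stream-aware truncate might be sketched on a POSIX system. This is not the code from my earlier posting, and fftruncate_sketch() is a hypothetical name; no such function exists in POSIX. The idea is to ask stdio for the logical position, which accounts for buffered output and read-ahead, instead of trusting the descriptor's own offset.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fftruncate(): cut the file at the stream's logical
 * position rather than at wherever the descriptor happens to be. */
int fftruncate_sketch(FILE *fp)
{
    int rc = -1;

    assert(fp != NULL);
    flockfile(fp);                    /* keep other threads off the stream */

    /* A zero-offset seek writes out any pending output and is a valid
     * transition point whether the last operation was input or output. */
    if (fseeko(fp, 0, SEEK_CUR) == 0) {
        off_t pos = ftello(fp);       /* logical position, not the fd offset */
        if (pos != (off_t)-1)
            rc = ftruncate(fileno(fp), pos);
    }

    funlockfile(fp);
    return rc;
}
```

After reading two bytes from offset 4 of a ten-byte file, stdio has typically read ahead to the end, so the descriptor offset is 10 while the stream position is 6; the sketch truncates at 6.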
Generally speaking any experienced Unix programmer will endeavour to never mix stdio operations on a given file with low-level file descriptor I/O operations on the same file.
Endeavour, yes. The point is (once again) that POSIX provides in its unified API two different ways of accessing files, neither of which provides all the operations that the other does, using inconsistent naming conventions, and that sometimes you need to do an operation from one set to a file that was opened using the other set, and this has in recent days led to an actual mistake by an actual person OTHER THAN ME in Squeak.
There is a practical issue for Squeak, and a design issue.
Practical issue: Run the VM through lint or lclint, and track every issue to its cause. Be very very careful of functions that start with 'f'. Watch for mixing approaches: UNIX systems often have more than one way to do threads, or more than one way to do semaphores, or more than one way to do memory mapping.
Design issue: The only thing worse than the POSIX API is the Windows API. We must imitate neither.
As Smalltalk programmers, we need to be aware of common protocols, and avoid names like 'next' or 'size' unless our methods do what someone else familiar with the protocol would expect; and if we have something where a common name would make sense, we should not invent another.
By now Squeak has about as many classes as UNIX has functions, so modules are coming along just in time to avoid naming issues like this for classes. A great big SHOUT of THANKS to the people working on modules!
squeak-dev@lists.squeakfoundation.org