Date:      Thu, 29 Jun 2006 06:03:22 GMT
From:      Spencer Whitman <swhitman@FreeBSD.org>
To:        Perforce Change Reviews <perforce@FreeBSD.org>
Subject:   PERFORCE change 100265 for review
Message-ID:  <200606290603.k5T63MfB015464@repoman.freebsd.org>

http://perforce.freebsd.org/chv.cgi?CH=100265

Change 100265 by swhitman@swhitman_joethecat on 2006/06/29 06:03:09

	Commented files

Affected files ...

.. //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/SocTask1#4 edit
.. //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/cpp.c#4 edit
.. //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/file.c#4 edit
.. //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/k.c#5 edit
.. //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/k.h#4 edit
.. //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/string.c#4 edit

Differences ...

==== //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/SocTask1#4 (text+ko) ====

@@ -55,3 +55,163 @@
    beneficially in the FreeBSD kernel source code.
 
    Implement it
+
+
+
+____________________________________________________________________
+
+Domain specific languages have been known for over three decades and
+have been a widely accepted paradigm for more than half of that time.
+
+So the time has come to create a DSL for kernel coding.
+
+Only problem is, we're not quite sure what it should do and migration
+is a tricky business on its own, because people are so damned
+conservative in this project.
+
+In FreeBSD we do not want to get any further into the compiler
+business than we have to.  GCC has always been a major headache for
+us over the years, and we just wish we didn't have to even think
+about compilers at all.
+
+So the "real" task for the secret K-language conspiracy (largely
+myself as the GodFather with George (gnn@) and Diomedis (dss@)
+as my henchmen) is to sneak 'K' in by the backdoor, by making
+the developers' lives a little bit easier with every step we take.
+
+If we look at the end goal we're aiming at, it consists of a
+"kcc" compiler which compiles .h, .c and .k files into C language,
+which a C compiler (likely GCC) will turn into executable code
+for us.
+
+One benefit appears right there:  No matter how they screw up
+GCC in the future, we have a layer where we can isolate our
+source code from these screwups.  (Or imagine the STDC people
+suddenly making "lock" a reserved word or something).
+
+It also follows from the above that 'K' itself must be a superset
+of 'C', with the footnote that the 'C' we talk about is the subset
+of STDC which we have settled on using for the FreeBSD kernel.
+(There are things in STDC we don't use in the kernel.  Trigraphs,
+floating point and wide characters being the most prominent examples.)
+
+So in order to get anywhere, what we need to do is insert a
+program in the compiler chain which will not affect the compilation,
+but which will give us a place to start implementing and experimenting
+with the K extensions to C.
+
+Inserting such a program will slow compilation down a bit,
+so we need to bring some benefit to justify this slowdown.
+
+But there is another avenue in:  In FreeBSD we have the style(9)
+coding style, and we could gain some traction if we provided
+a program which would warn about transgressions on style(9)
+in the same way as lint(1) warns about transgressions on C.
+
+This is a less heavy burden to lift because we do not need to
+generate code, only messages based on our analysis.
+
+
+This is where we are right now:  trying to write that program
+and trying to identify and implement those benefits.
+
+
+The code is basically a small lexer and parser for the FreeBSD subset of STDC.
+If you run the FreeBSD kernel sources through a CPP macroprocessor
+first, my code will lex and parse the kernel sources correctly.
+
+It doesn't generate any code at this point, it merely avoids 
+barfing.
+
+
+So your first task is to implement the necessary CPP macro processor
+facilities so that we can avoid using an external CPP to run FreeBSD
+kernel sources through.
+
+This basically means #define, #if, #ifdef ... #endif and macro
+expansion.
+
+The good news is that it shouldn't take too long:  CPP is a pretty
+simple concept, although some of the STDC decisions foul up some
+corner cases.
+
+The bad news is that there always seems to be some piece of code
+which relies on any particular weird corner case of the CPP language.
+
+
+Next step is to look for tangible benefits.
+
+The #! expansion is my first guess (but better ideas are very
+welcome!) and after that we should probably see if we can detect
+unused #include files (a continuing problem in FreeBSD) and after
+that look for things in style(9) which we can detect with the
+full cpp/lexer/parser combo.
+
+However, these are merely my ideas, and if you have or come across
+better ideas I am all ears.
+
+
+I hope you understand that all these weird restrictions are not
+put in place to make your life miserable.  Introducing a new
+language for kernel coding in a conservative project like FreeBSD
+takes some careful planning and there are many toes we need to
+avoid stomping on.  But 14 years of experience with this crew
+has taught me that making their life easier in the long run will
+always win the hearts in the end.
+
+
+Now that you have studied the code a bit, I hope you can see how I
+tried to avoid copying data around more than necessary, for instance
+by pointing from the tokens into the original file rather than
+copying them into the token structure, etc.
+
+This is an attempt to drag modern performance programming
+practice into the compiler, in the hope that we will end up with
+a compiler which can run very efficiently on modern multi-core
+cpus.
+
+A traditionally partitioned compiler like GCC runs as three
+processes with pipes between them:
+
+	cpp | cc1 | asm
+
+The pipes mean that the process has to dive into the kernel,
+fiddle around with locking there etc.
+
+In the K compiler, I still want to have distinct stages for reasons
+of structure, but I want them to live in the same process and hand
+data over without bothering the kernel if at all possible.
+
+
+The other thing which is important to me is that we build a graph
+so we can track backwards for error reporting. 
+
+Some of the more horrible macros can give quite unhelpful diagnostics
+if used wrong, because the error is emitted from one of the middle
+layers of a stack of macro expansions.
+
+I would like the compiler to emit very detailed error messages, showing
+step by step how it ended up with the tokens it tried to process.
+Something like this mock-up of an error message:
+
+   Syntax error:  Identifier expected, found floating number:
+	4.56 += 3.14;
+	^^^^
+   expanded from macro ADDFP(a,b)
+	defined at fooinclude.h line 8
+   called from fooinclude.h line 12
+	ADDFP(4.56, 3.14);
+   expanded from macro PLUSPI(aa)
+	defined at fooinclude.h line 9
+   called from mymacros.h line 123
+	g = PLUSPI(4.56);
+   #included from mysrc.c line 4
+        #include "mymacros.h"
+
+To do this, you have to build a tree for the macro expansions so
+that you can backtrack to generate these messages.  But do keep in
+mind, most of the time the messages will not be emitted, so you
+should design the tree to be fast in the normal case where all that
+info will never be used.  It doesn't matter if linear searches are
+necessary to generate the diagnostic message; the programmer will
+be wasting far more time fixing the mistake anyway.
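
As an illustration of the macro-expansion tree described at the end of SocTask1, here is a minimal sketch. It is not taken from the K sources: `struct origin`, `enum origin_kind` and `print_backtrace` are hypothetical names, and the output format only loosely follows the mock-up. Each expansion or #include step is a node holding a parent pointer, so recording a step in the normal case costs a few stores, and the linear walk only happens when a diagnostic is actually emitted.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is from the K sources. */
enum origin_kind { ORIGIN_INCLUDE, ORIGIN_MACRO };

/* One node per macro expansion or #include step. */
struct origin {
	enum origin_kind	 kind;
	const char		*name;	/* macro name or included file */
	const char		*file;	/* file holding the call or #include */
	unsigned		 line;	/* line of the call or #include */
	const struct origin	*parent;	/* NULL at the top-level file */
};

/*
 * Slow path: walk from the failing token's origin towards the root,
 * printing one entry per step.  Returns the number of steps printed.
 */
static int
print_backtrace(FILE *fp, const struct origin *o)
{
	int depth = 0;

	assert(fp != NULL);
	for (; o != NULL; o = o->parent, depth++) {
		if (o->kind == ORIGIN_MACRO)
			fprintf(fp, "   expanded from macro %s\n"
			    "\tcalled from %s line %u\n",
			    o->name, o->file, o->line);
		else
			fprintf(fp, "   #included from %s line %u\n",
			    o->file, o->line);
	}
	return (depth);
}
```

A token would carry a single pointer to its innermost `struct origin`; tokens that come straight from the top-level file carry NULL and cost nothing extra.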
==== //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/cpp.c#4 (text+ko) ====
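
No differences are shown for cpp.c in this change. Purely as an illustration of the conditional handling that SocTask1 asks for (#if, #ifdef ... #endif), the sketch below tracks nesting with a small stack. All names (`struct cond_stack`, `cond_push`, `cond_else`, `cond_pop`, `cond_live`) and the `MAXDEPTH` limit are assumptions, the condition is assumed to be evaluated elsewhere, and #elif and duplicate-#else detection are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAXDEPTH 64	/* hypothetical nesting limit */

struct cond_stack {
	int	depth;
	bool	live[MAXDEPTH];		/* is this level emitting tokens? */
	bool	taken[MAXDEPTH];	/* may #else no longer become live? */
};

/* Are we in a region whose tokens should be kept? */
static bool
cond_live(const struct cond_stack *cs)
{
	assert(cs != NULL);
	return (cs->depth == 0 || cs->live[cs->depth - 1]);
}

/* #if, #ifdef or #ifndef, with the condition already evaluated. */
static int
cond_push(struct cond_stack *cs, bool value)
{
	bool outer = cond_live(cs);

	if (cs->depth == MAXDEPTH)
		return (-1);
	cs->live[cs->depth] = outer && value;
	/* Inside a dead region no branch may ever become live. */
	cs->taken[cs->depth] = !outer || value;
	cs->depth++;
	return (0);
}

/* #else: live only if no earlier branch at this level was taken. */
static int
cond_else(struct cond_stack *cs)
{
	int i = cs->depth - 1;

	if (i < 0)
		return (-1);	/* #else without a matching #if */
	cs->live[i] = !cs->taken[i];
	cs->taken[i] = true;
	return (0);
}

/* #endif */
static int
cond_pop(struct cond_stack *cs)
{
	if (cs->depth == 0)
		return (-1);	/* #endif without a matching #if */
	cs->depth--;
	return (0);
}
```

The lexer would consult `cond_live()` before handing tokens on, while still scanning directives inside dead regions so that nesting stays balanced.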


==== //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/file.c#4 (text+ko) ====

@@ -19,9 +19,12 @@
 	int fd;
 
 	filename = String(filename, NULL);
+
+	/* Check if this file has been loaded already */
 	TAILQ_FOREACH(s, &sourcefile_head, list)
 		if (s->filename == filename)
 			return (s);
+
 	fd = open(filename, O_RDONLY);
 	if (fd < 0)
 		return (NULL);

==== //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/k.c#5 (text+ko) ====

@@ -111,21 +111,32 @@
 	struct h *hf, *hg;
 	char *p;
 
+	/* Set up print stuff */
 	register_printf_render('T', printf_render_token, printf_arginfo_token);
 	register_printf_render_std("HVQM");
 	setbuf(stdout, NULL);
 
+	/* Set up string tokens */
 	InitString();
 
+	/* Create a new list of tokens and initialize the symbol lists */
 	hg = NewH();
 	hg->sym = NewSymScope();
 
+	/* Initialize type information */
 	InitTypes();
 
 #if 0
 	CppIarg("-I/usr/include");
 #endif
-
+	/* Get command line arguments 
+	 * D: Not implemented
+	 * U: Not implemented
+	 * I: Include file optarg
+	 * W: Not implemented
+	 * c: Not implemented
+	 * default: print usage
+	 */
 	while ((ch = getopt(argc, argv, "cD:U:I:W:")) != -1) {
 		switch (ch) {
 		case 'D': CppDUarg(hg, optarg, 1); break;
@@ -140,27 +151,38 @@
 	}
 	argc -= optind;
 	argv += optind;
+	/* Exit in case of no file */
 	if (argc < 1)
 		errx(1, "Missing file argument(s)");
 
 	for (ch = 0; ch < argc; ch++) {
-printf("argv[%d] = %Q\n", ch, argv[ch]);
-		p = strrchr(argv[ch], '.');
-		if (p == NULL)
-			errx(1, "No '.' in filename %Q", argv[ch]);
-		if (p[1] == 'h') {
-printf("H file\n");
-			hf = hg;
-		} else if (p[1] == 'c') {
-printf("C file\n");
-			hf = NewH();
-			hf->sym = hg->sym;
-			PushSymScope(hf);
-		} else
-			errx(1, "Unknown filename suffix %Q", p);
-		Cpp(hf, argv[ch]);
-		if (0) 
-			DumpRefs(stdout, hf);
+	  printf("argv[%d] = %Q\n", ch, argv[ch]);
+	  
+	  /* Determine what type of file has been passed to K */
+	  p = strrchr(argv[ch], '.');
+	  
+	  if (p == NULL)
+	    errx(1, "No '.' in filename %Q", argv[ch]);
+	  
+	  if (p[1] == 'h') {
+	    printf("H file\n");
+	    /* Use hg's token and symbol lists */
+	    hf = hg;
+	  } else if (p[1] == 'c') {
+	    printf("C file\n");
+	    /* Create a new list of tokens */
+	    hf = NewH();
+	    /* Set the symbol list to hg's */
+	    hf->sym = hg->sym;
+	    /* Add a new symbol scope to hf */
+	    PushSymScope(hf);
+	  } else
+	    errx(1, "Unknown filename suffix %Q", p);
+	  
+	  Cpp(hf, argv[ch]);
+	  
+	  if (0) 
+	    DumpRefs(stdout, hf);
 		if (0)
 			DumpTokens(stdout, hf);
 		if (p[1] == 'c') {
@@ -169,7 +191,7 @@
 			PopSymScope(hf);
 		}
 	}
-
+	
 	return (0);
 }
 

==== //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/k.h#4 (text+ko) ====

@@ -5,8 +5,8 @@
 /* -------------------------------------------------------------------*/
 
 struct s {
-	const char		*b;
-	const char		*e;
+        const char		*b; /* Beginning of a file (in file.c) */
+        const char		*e; /* End of the file (in file.c) */
 	struct ref		*r;
 };
 

==== //depot/projects/soc2006/swhitman-K_Kernel_Meta-Language/k/string.c#4 (text+ko) ====

@@ -36,7 +36,7 @@
 {
 	struct string *s;
 	struct string_head *h;
-	unsigned l, hash;
+	unsigned l, hash; /* XXX hash is unused here */
 
 	assert(b != NULL);
 	if (e == NULL) {
@@ -50,6 +50,7 @@
 	hash = *b;
 	if (l > 1)
 		hash = (hash << 8) | b[1];
+	/* Have we already inserted this string into the hash table? */
 	h = &strings[*b % NHASH];
 	LIST_FOREACH(s, h, list) {
 		if (b == s->string)
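
The pointer comparison `s->filename == filename` in file.c is only meaningful if String() returns one canonical pointer per distinct string. A minimal sketch of such an interning table follows, under stated assumptions: `intern`, `struct istr`, `buckets` and `NBUCKET` are hypothetical names (the real value of NHASH is not shown), and unlike the hunk above, the two-byte hash is actually used to pick the bucket.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKET 251	/* hypothetical; prime so both hash bytes matter */

struct istr {
	struct istr	*next;
	size_t		 len;
	char		*str;
};

static struct istr *buckets[NBUCKET];

/*
 * Return the canonical copy of the bytes [b, e).  If e is NULL, b is
 * taken to be NUL-terminated.  Returns NULL if allocation fails.
 */
static const char *
intern(const char *b, const char *e)
{
	struct istr *s;
	unsigned hash;
	size_t l;

	assert(b != NULL);
	if (e == NULL)
		e = b + strlen(b);
	l = (size_t)(e - b);

	/* Two-byte hash as in the diff, here used to select the bucket. */
	hash = l > 0 ? (unsigned char)b[0] : 0;
	if (l > 1)
		hash = (hash << 8) | (unsigned char)b[1];

	/* Have we already inserted this string? */
	for (s = buckets[hash % NBUCKET]; s != NULL; s = s->next)
		if (s->len == l && memcmp(s->str, b, l) == 0)
			return (s->str);

	s = malloc(sizeof *s);
	if (s == NULL)
		return (NULL);
	s->str = malloc(l + 1);
	if (s->str == NULL) {
		free(s);
		return (NULL);
	}
	memcpy(s->str, b, l);
	s->str[l] = '\0';
	s->len = l;
	s->next = buckets[hash % NBUCKET];
	buckets[hash % NBUCKET] = s;
	return (s->str);
}
```

Once every filename has passed through such a function, equal names share one pointer and a plain `==` is enough to tell whether a file has been loaded before.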


