Date:      Fri, 21 Nov 2014 18:57:28 -0200
From:      françai s <romapera15@gmail.com>
To:        freebsd-questions@freebsd.org
Subject:   A FreeBSD developer told me via private message that most FreeBSD developers don’t develop in machine code
Message-ID:  <CAK_6RweaS5wEbOO8X31dcQ9i70e4Mb6U7TgCewGDkvrJrrZTxw@mail.gmail.com>

This subject is irresistible to me; I like it so much that it brings
tears to my eyes.

This topic is mainly for FreeBSD developers who develop in machine
code, or even in raw binary.

A FreeBSD developer told me via private message that most FreeBSD
developers don’t develop in machine code. In other words, only a minority
of FreeBSD developers develop in machine code, or even in raw binary.

Thought I'd share and hope that someone can get some use out of it.

He told me this:

"We either create a macro that expands to something like ".word <foo>", or
sometimes the .word <foo> is just hard-coded inline when there’s only going
to be one of them. Sometimes we expose them both in assembly and in C code,
in which case what we do varies a bit to accommodate the different
languages’ syntax. It is rare, but has happened, that we only expose it to
C code.

Generally, though, we try to add support for the opcodes to gas so that we
get the constraint testing it does (making sure the opcode is supported at
the level you are compiling, making sure it isn’t in a delay slot or
violating some other precondition for its use)."

"You pointed me at macros that defined operations in terms of opcodes the
assembler didn’t understand, with the workaround being the assembler
directive .word followed by some hex value to encode the opcode."
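
To make the ".word followed by some hex value" workaround concrete, here is
a minimal sketch of what the assembler ends up emitting for such a
directive. It assumes a target where .word is 32 bits wide (as it is for
MIPS in gas) and that the bytes are laid out in the target's byte order.
The helper name word_bytes is hypothetical, and the sample value is the
MIPS32r2 "ei" encoding, used here purely as an illustration.

```python
import struct


def word_bytes(value: int, big_endian: bool = True) -> bytes:
    """Return the four bytes a 32-bit `.word value` directive would emit.

    Assumption: .word is 32 bits wide and is stored in the byte order of
    the target (big-endian by default in this sketch).
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, value)


if __name__ == "__main__":
    sample = 0x41606020  # illustrative value: MIPS32r2 "ei"
    print("big-endian:   ", word_bytes(sample).hex())
    print("little-endian:", word_bytes(sample, big_endian=False).hex())
```

The point is only that the hex value after .word is the instruction itself:
the assembler copies it into the output without knowing what it means, so
none of its usual checks apply.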

"Most developers of FreeBSD don’t write directly in machine code in
FreeBSD development, and don’t care. Some developers use the macros that I
described sometimes when doing specific, low-level coding. A handful of
developers create the macros directly or use the .word directives in their
work to make certain things work that cannot work otherwise.

People generally don’t write in raw machine opcodes. That is independent of
FreeBSD.

However, a few, specialized people will find the need to do it from time to
time. Usually because they are porting FreeBSD to a newer processor that
needs newer opcodes to do context switching, optimize interrupt handling,
cope with a new type of cache coherency, etc. These people look up the
assembler in the docs from the vendor and then create the .word workaround
to make sure things work. If they have the time, they may add it to our
somewhat ancient gas assembler as well."

"Almost nobody writes directly in binary. There are occasional exceptions,
though."

"> Are there universities that teach that it is sometimes necessary to code
in machine code?

My own personal, first-hand experience from being in the industry for the
last 25 years.

> If yes, in which countries are they? If you cannot name all of the
countries, please give just a few examples.

I don’t know which universities teach this, but here are several
examples I’ve done or seen in my career.

1) When the assembler only supports the old processors, but you are porting
an operating system to a new one. You need to either enhance the assembler
for the new opcodes, or you need to hand assemble them somehow. Often these
are two different skill sets, so one engineer gets tasked with adding the
new opcodes, and another has to use them. Often the people using them are
ahead of the people augmenting the assembler, so they hand assemble things.
In FreeBSD’s case, the project chose to freeze gas at an ancient level, so
all new machines that have new opcodes need to be assembled by hand.
2) If you are writing a virus or other attack vector, you often need
to hand assemble the “egg” code that runs on the victim processor. There
are many variations on a theme here, including writing code that tweaks
other code to do bad things, which is another form of writing machine code.
3) KERMIT. Kermit is a file transfer program written entirely in assembler
on many platforms for speed. Kermit is quite large and sophisticated, which
was a barrier to entry back in the day before the Internet and most
communications protocols were standardized. To ease the transition, and
taking advantage of the .COM format in DOS, Kermit came with a bootstrap
program that was made up entirely of printable characters so that one could
easily type it in (well, not so easily, but it was possible since it was
only maybe a hundred or two bytes long). The authors of this program had to
learn which assembler opcodes and addressing modes led to printable
characters and write their code accordingly. Not exactly programming
directly in machine code, but very close. It was really quite an impressive
bootstrap technique.
4) Debugging. While not directly writing in machine code, one must do the
opposite and decode instructions sometimes to understand what was happening
when a trap occurred. Most people rely on the debugger to do this. And it
works most of the time. Sometimes, though, it doesn’t and you either have
to accept that you can get no useful data from the crash, or you have to
start decoding instructions to find out what went wrong.
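
As an illustration of the decoding described in item 4, the following is a
minimal sketch that splits a 32-bit MIPS instruction word into the fields
of the standard R-type layout. The function name decode_fields is
hypothetical, and the sample word (addu $2, $4, $5) is chosen here only as
an example; it does not come from any real crash.

```python
def decode_fields(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into its R-type bit fields."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction word must fit in 32 bits")
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs": (word >> 21) & 0x1F,      # bits 25..21
        "rt": (word >> 16) & 0x1F,      # bits 20..16
        "rd": (word >> 11) & 0x1F,      # bits 15..11
        "shamt": (word >> 6) & 0x1F,    # bits 10..6
        "funct": word & 0x3F,           # bits 5..0
    }


if __name__ == "__main__":
    # Illustrative word: addu $2, $4, $5 (opcode 0, funct 0x21).
    for name, value in decode_fields(0x00851021).items():
        print(f"{name:>6} = {value:#x}")
```

A real decoder would go on to look up the opcode and funct values in the
vendor's instruction tables; this sketch stops at extracting the raw
fields.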

Going back even further, there are many others. Back before there were good
consoles for computers, one had to enter a few words of boot code into the
switches on the front panel and hit run to start / boot the computer. Most
of these systems died out around the late 70s or early 80s (though as a
vestige of the old system, newer models retained the toggle switches to
allow for older techniques to work). But I don’t think that’s what you
mean.

To give a concrete example of #1: gas on FreeBSD didn’t use to support the
EI and DI instructions for the mips32r2 and mips64r2 ISAs. When I ported
FreeBSD to the Octeon processor, we wanted to make use of these
instructions. I created macros for the assembler to generate these
instructions and used them to optimize the context switching code in
FreeBSD a bit. Later, when someone else added them to gas as part of a
wholesale importing of new MIPS opcodes, I removed the macros and used the
native opcodes directly.
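
The hand-assembly step behind macros like these can be sketched as follows.
This is not the code that was used in FreeBSD; it is a minimal
illustration, assuming the MIPS32 Release 2 encoding in which EI and DI
share the COP0/MFMC0 format and differ only in the "sc" bit (bit 5). The
names encode_mfmc0 and word_directive are hypothetical.

```python
COP0 = 0x10        # major opcode, bits 31..26
MFMC0 = 0x0B       # selector in the rs field, bits 25..21
STATUS_REG = 12    # CP0 Status register number, in the rd field (bits 15..11)


def encode_mfmc0(rt: int = 0, enable: bool = False) -> int:
    """Hand-assemble DI (enable=False) or EI (enable=True).

    rt is the general register that receives the old Status value;
    register 0 discards it.
    """
    if not 0 <= rt <= 31:
        raise ValueError("rt must be a register number from 0 to 31")
    sc = 1 if enable else 0
    return (COP0 << 26) | (MFMC0 << 21) | (rt << 16) | (STATUS_REG << 11) | (sc << 5)


def word_directive(mnemonic: str, word: int) -> str:
    """Format the encoding as a .word line, with the mnemonic as a comment."""
    return f".word 0x{word:08x}\t# {mnemonic}"


if __name__ == "__main__":
    print(word_directive("di", encode_mfmc0(enable=False)))
    print(word_directive("ei", encode_mfmc0(enable=True)))
```

Once the assembler learns the real mnemonics, lines like these can be
replaced by the native instructions, which is what the account above
describes.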

So while it is a useful approximation that nobody does it, people do do it,
have done it forever and there are good reasons that some very small number
of people will continue to do it into the future. I can’t give information
about which universities teach this, but I do know from first hand
experience that the number isn’t 0."

This is all that I want to share.

The following link leads to a tutorial that teaches assembly programming
on FreeBSD:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/x86.html

Is there also a tutorial that teaches programming in machine code on FreeBSD?
