Closer look at the registers on an ARM CPU

In my dabbling with ARM assembly so far (my most recent achievement was completing my simple sorting algorithm – last update here), I had picked up that there are 16 general-purpose 32-bit registers for holding either values or addresses, R0 through R15, but taking a closer look I realized some of these have specific purposes and/or uses by convention.

R0: function argument or result

R1-R3: function args

R4-R12: general purpose

R13: SP – this is the stack pointer

R14: LR – Link Register – it holds the address to branch back to when you return with BX LR

R15: PC – the Program Counter – tracks the instruction currently being processed (in ARM state, reading it yields the address of the current instruction plus 8)

More info in the great summary on ARM Assembly here, and also in the ARM11 tech ref here.


Ubuntu 14.04 with nvidia drivers

I’ve been on a kick installing various flavors of OS recently (I’ve been repurposing an older desktop and starting with an empty hdd). In the past the brownish/orange colors of Ubuntu have just put me off, and I thought the Unity desktop was just a bit too unusual to be useful. So I started with Mint Cinnamon, which has been my main desktop OS for a couple of months. Then I started looking at Fedora 22. This gave me no end of installation pain.

I’m installing on an HP Pavilion with an AMD quadcore and nvidia 6150 onboard graphics. It seems this older gpu is killing me. Fedora 22 hangs on install at around 33%. Fedora 21 will install in simple graphics mode. Trying to get the nvidia graphics driver installed, though, gave me a few late nights. No matter which instructions I followed, I could not get the nouveau driver unloaded, and so would always get the error message that the nouveau kernel module was still loaded. I tried various tips from online sources, and eventually gave up.

A while back I noticed the noobslab site, with easy-to-follow apt-get steps to install new themes for Ubuntu. Huh, so if I can install a different theme then I can get rid of the brown default theme? I played with Ubuntu Tweak a while back but didn’t spend enough time with it to end up with something that I liked, so I was ready to give it another go. So I installed Tweak:

sudo add-apt-repository ppa:tualatrix/ppa
sudo apt-get update
sudo apt-get install ubuntu-tweak

Installed Crunchy themes from noobslab, and now I’m all set. Looks pretty cool too… this will do for a while.

My previous steps for installing nvidia-304 work fine on Ubuntu 14.04 too. So all set.


Windows 95 launched 20 years ago today

On August 24th 1995 Windows 95 was launched. The minimum required specs were:

  • a 386 cpu (around 25MHz at the time?)
  • 4MB RAM
  • 55MB free disk space

Fast forward to 2015. The minimum specs for Windows 10 are:

  • a 1 GHz cpu
  • 1 GB RAM for 32-bit or 2 GB for 64-bit (really? I find this hard to believe; 4GB is probably a realistic minimum, 8GB to be comfortable)
  • 16 GB disk space for 32-bit OS, 20 GB for 64-bit OS

Somehow this is supposed to be better, but I’m not sure exactly how. Running an OS in 4MB? How exactly was that possible?! That seems impossible by today’s standards. And yet, Windows 10 is so much better? Better at needing more resources than any previous Windows version?

If there’s one thing for sure, in the 20 years since, no-one has danced with such enthusiasm for the launch of a new operating system as Steve Ballmer, doing whatever he was doing here.

And yes, Windows 95, with your blue screens and all, “you made a grown man cry”.

I’ve always wondered what the intent was behind using the Stones’ “Start Me Up” as the theme music for Windows 95. Yes ok, start button, “Start Me Up”, ok, I get it. But didn’t they listen to the rest of the lyrics? Maybe they were thinking crying in happiness, but in reality it was more often crying in despair :-0


Implementing simple sort algorithms in ARM Assembly (part 3)

I finished the first rough version of my simple sort algorithm in ARM Assembly (see part 1 and part 2 of my updates). Here it is so far (prior to some cleanup and optimization):

/*
R0 address of string used with printf to output %d
R4 address of numbers to sort
R5 current number to be compared
R6 offset index for outer loop through numbers
R7 offset index for inner loop
R8 current smallest identified value
R9 current offset index of next uncompared value
*/
.global main
main:
  push {ip, lr}
  MOV R6, #0 @outerloop offset to numbers to be sorted
  MOV R7, #0 @innerloop offset to number to be sorted
  MOV R9, #0 @init value for index to next uncompared value
outerLoop:
  MOV R8, #99 @reset large default for next loop comparison
  MOV R7,R6 @copy outerloop offset to next starting offset for the innerloop
innerLoop:
  LDR R0, =output @load addr of output string
  LDR R4, =nums @load addr of nums to compare to R4
  LDR R5,[R4,R7] @load current num to R5 from R4 with offset R7
  MOV R1,R5 @move num for output
  BL printf
  CMP R5,R8 @is current < smallest so far
  BLLT swapSmallest @if true, call swapSmallest (BL sets LR, so its BX lr returns to continue) to swap smallest to current first position
continue:
  CMP R7,#16 @ 0 plus 4*4bytes for 5 entries in array
  ADD R7, R7,#4 @inc offset by 4 bytes
  BLT innerLoop
continueOuterLoop:
  CMP R6, #16 @check if we've looped through all values
  ADD R6, R6, #4
  BLT outerLoop @if not, branch back to start of outer loop
_exit:
  POP {ip, lr}
resetLoopOffsets:
  MOV R7, #0 @reset loop counter
writeFinalSortedList: @TODO: this is a near copy of the inner loop - refactor this to function
  LDR R0, =writeSorted @load addr of output string
  LDR R4, =nums @load addr of nums
  LDR R5,[R4,R7] @load current num to R5 from R4 with offset R7
  MOV R1,R5 @move num for output
  BL printf
  CMP R7,#16 @ 0 plus 4*4bytes for 5 entries in array
  ADD R7, R7,#4 @inc offset by 4 bytes
  BLT writeFinalSortedList
doExit:
  MOV R0, #0 @exit status for the exit syscall is passed in R0
  MOV R7, #1
  SWI 0
swapSmallest:
  MOV R8,R5 @keep copy of smallest in the current loop
  LDR R10, [R4,R6] @tmp copy first position to R10
  LDR R11, [R4,R7] @tmp copy value in position currently being compared
  STR R10, [R4, +R7] @swap first position value to current position being compared
  STR R11, [R4, +R6] @swap the current smallest value into the current first position
  BX lr @return
.data
nums:
.word 5,2,7,1,8
output:
.asciz "%d\n"
writeSorted:
.asciz "%d\n"

The complete source is on GitHub here if you want to grab a copy.
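To make the loop structure of the listing easier to follow, here is a rough Python mirror of it. This is only a sketch of how I read the assembly, not a translation the author provided: the function name `sort_like_listing` and the variable names are my own, the list stands in for the `nums` array, and the byte offsets are divided by 4 to get list indexes.

```python
WORD = 4  # bytes per .word entry

def sort_like_listing(nums):
    """Mirror the listing's loops, using byte offsets as R6 and R7 do."""
    nums = list(nums)              # work on a copy of the 'nums' array
    last = (len(nums) - 1) * WORD  # 16 for five words, as in CMP Rn,#16
    r6 = 0                         # outer loop byte offset
    while True:
        r8 = 99                    # large default, as in MOV R8,#99
        r7 = r6                    # inner loop starts at the outer offset
        while True:
            r5 = nums[r7 // WORD]  # LDR R5,[R4,R7]
            if r5 < r8:            # CMP R5,R8, then conditional branch to swapSmallest
                r8 = r5            # remember the smallest value seen so far
                i, j = r6 // WORD, r7 // WORD
                nums[i], nums[j] = nums[j], nums[i]
            at_end = r7 >= last    # the compare happens before the ADD
            r7 += WORD
            if at_end:
                break
        at_end = r6 >= last        # same pattern for the outer loop
        r6 += WORD
        if at_end:
            break
    return nums

print(sort_like_listing([5, 2, 7, 1, 8]))
```

Two things stand out when it is written this way. The swap happens every time a smaller value is found, rather than once per pass. And the default of 99 in R8 means values of 99 or more are never treated as "smallest", so this version only sorts correctly when every value is below 99.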

To get this far I learned plenty about ARM architecture – over time it has evolved and there are many different versions, and different ARM based CPUs implement different architecture versions. To make things more complicated, the naming scheme is a bit confusing.

The ARM CPU in the Raspberry Pi is a Broadcom BCM2835 System on a Chip (SoC), which includes an ARM1176JZF-S (ARM reference manual here). This is an ARM11 core, based on ARMv6 architecture.

Points of interest about the ARMv6 instructions (not a comprehensive summary, but some rough notes to refer back to later):

  • The majority of instructions are structured ‘instruction destination, source’, but STR (Store Register) is reversed: it is ‘instruction source, destination’, with the register to be stored first and the memory address second.
  • LDR (Load Register) can take a label as its source, which loads the value stored at that label, or a label prefixed with ‘=’, which loads the address in memory where that value is located.
  • LDR can also load the value pointed to by an address held in another register, using [Rn], and this can be combined with an offset held in a second register, using [Rn, Rm].
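The notes above on LDR and STR can be illustrated with a toy Python model of memory. This is a sketch under my own assumptions, not anything from the ARM manuals: `ldr` and `str_` are hypothetical helpers, memory is a plain `bytearray`, words are assumed to be little-endian, and address 0 stands in for wherever the assembler would place `nums`.

```python
import struct

def ldr(memory, base, offset=0):
    """Model LDR Rd,[Rn,Rm]: read the 32-bit word at address base+offset."""
    return struct.unpack_from("<i", memory, base + offset)[0]

def str_(memory, value, base, offset=0):
    """Model STR Rd,[Rn,Rm]: write a 32-bit word; note the source comes first."""
    struct.pack_into("<i", memory, base + offset, value)

# Hypothetical memory image: five little-endian words, like '.word 5,2,7,1,8'
memory = bytearray(struct.pack("<5i", 5, 2, 7, 1, 8))
nums = 0  # plays the role of 'LDR R4, =nums': the address, not the contents

r10 = ldr(memory, nums, 0)    # value at the first position
r11 = ldr(memory, nums, 12)   # value at byte offset 12 (the fourth word)
str_(memory, r10, nums, 12)   # store the first value into the fourth slot
str_(memory, r11, nums, 0)    # store the fourth value into the first slot

print(struct.unpack("<5i", memory))
```

The offsets are in bytes, not array indexes, which is why the loops in the listing step by 4 for each 32-bit word.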

I’ll probably spend some time to see if I can clean up the code some more, but I’m happy with this so far.


‘New’ Windows 10 features inspired from other OSes

Every time I hear or read about one of the ‘new’ features introduced in Windows 10, I can’t help but think ‘hang on, hasn’t [Linux|OS X] already had that feature for years?’

The past few OS X releases have been minimizing the visual window decorations in favor of borderless windows and minimal icons, and Ubuntu 14.04 (released over a year ago) has done the same. It seems to be the current fashion. I’m not a historian of UI design, but I had to dig back to around 2011, when I think OS X Lion introduced borderless windows; windows in OS X have had this look and feel for long enough now that I take it for granted as normal.

One of the features I’ve always missed in Windows, and that 10 now has, is multiple virtual desktops. This is something I always use in OS X and Linux; it seems like it’s always been there, and again, it’s one of those features you take for granted. I like to keep related windows for one task on one desktop and windows for another task on another. Anyway, welcome to the 1990s, Windows 10.

If you want to see some more examples, itsfoss.com has a comparison of other ‘new’ Windows 10 features that have been borrowed from Linux.


Implementing simple sort algorithms in ARM Assembly (part 2)

I haven’t completed the code yet, but I wanted to share my progress learning ARM assembly by implementing a simple sort algorithm (part 1 is here). I’m committing my changes as I go, so if you’re interested you can also pull the code from GitHub here.

The simple sort that I’m implementing is a ‘comparison sort’ (this particular approach is usually called a selection sort). You start at the lowest end of the array of values, iterate through to find the smallest value, and then switch the smallest found value to the front. You then repeat the loop starting at the next index in the array, search again for the smallest, switch, and continue repeating this until you’ve looped through and compared all values.
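As a point of reference while reading the assembly, here is a minimal Python sketch of the find-the-smallest-then-switch procedure described above. The function and variable names are mine, chosen for illustration; it is not a port of the ARM code.

```python
def selection_sort(values):
    """Return a sorted copy by repeatedly moving the smallest value forward."""
    values = list(values)  # leave the caller's list untouched
    for start in range(len(values)):
        smallest = start   # index of the smallest value found so far
        for i in range(start + 1, len(values)):
            if values[i] < values[smallest]:
                smallest = i
        # switch the smallest found value to the front of the unsorted part
        values[start], values[smallest] = values[smallest], values[start]
    return values

print(selection_sort([5, 2, 7, 1, 8]))
```

The outer loop corresponds to the starting index that moves up by one each pass, and the inner loop is the search for the smallest remaining value.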

I’ll make clear that as I’m learning ARM ASM I’ve no idea at this point if my approach to implementing this algorithm is optimal, but I’m finding it a useful learning exercise. At this point I’m also finding debugging the code in Eclipse C++ indispensable: I don’t think I could debug the code without an IDE (or to try would be difficult and error prone). Once you’ve walked through the steps to crosscompile in Eclipse C++, you can use the same setup to remote debug in Eclipse C++ too, with the executable running remotely on the Raspberry Pi.

So far I have the outer and inner loops working, so can iterate through the values, and compare to find the smallest value on each iteration. I’ll post another update once I’ve got the swapping done. In the meantime if you’re interested you can take a look at my latest commit in my github repo above.


Uncle Bob: “Make the Magic Go Away” – why you should learn some Assembly

I’ve been spending some spare time learning some ARM Assembly (and sharing some of my experiences here, here and here).

In the early 90s at college I did a module on 68000 Assembly on the Atari ST, but I haven’t done any since. I remember being amazed at how complicated it was to implement even the simplest of code, since you’re dealing with a very limited set of instructions, using instructions that the CPU itself understands. At the same time, though, you gained an insight into what goes on under the covers and how the computer itself works: how the CPU’s registers are used, and how data is transferred from registers to memory and vice versa. It’s computing at its most elemental level; you’re working with the bare metal hardware.

Since I’ve also recently been playing around with random stuff on the Raspberry Pi, I thought I’d take a look at the ARM CPU and learn some ARM Assembly. I felt a need to get back to basics and learn about the architecture of ARM CPUs and what makes them tick. As much as this sounds pretty hardcore and crazy, ARM CPUs are showing up pretty much everywhere and you probably don’t even know it. There’s a good chance that at least one, if not more, of the mobile devices you currently own or have owned over the past few years has been powered by an ARM CPU. So given the memory and CPU constraints of small form factor devices, and also IoT type devices, it’s not completely off the wall to be interested in learning some ARM Assembly.

Anyway, back to my original point. If you want to understand what makes a computer tick (literally), you can’t go far wrong by learning some Assembly. You’ll get a far better understanding of what goes on under the covers, and a new appreciation of just how much abstraction there is in today’s high level languages (Java, C#, Objective-C etc) and how much they do for you without you even really having to know what’s going on under the covers. But if you really want to get a deeper understanding, you lift the hood/bonnet and start poking around in the engine, right?

It surprised me when I came across this post by Uncle Bob recently:

http://blog.8thlight.com/uncle-bob/2015/08/06/let-the-magic-die.html

Bob comments on the continual search within the industry to find the perfect language or library. We’re continually re-inventing languages and frameworks, but there’s really nothing revolutionarily different being ‘invented’: they’re all solving the same problems, and not really offering anything new. Bob even goes as far as to say there really hasn’t been anything new in computer languages for 30 years.

The unusual thing is that we seem to get caught up in the promise that maybe the next big language or framework ‘solves all problems’ and does it better than all other languages and frameworks before, but there’s still really nothing new.

Bob’s point:

But there is no magic. There are just ones and zeros being manipulated at extraordinary speeds by an absurdly simple machine. And that machine needs discrete and detailed instructions; that we are obliged to write for it.

And continues:

I think people should learn an assembly language as early as possible. I don’t expect them to use that assembler for very long because working in assembly language is slow and painful (and joyous!). My goal in advocating that everyone learn such a language is to make sure that the magic is destroyed.

And here it is, the reason why you should learn Assembler:

If you’ve never worked in machine language, it’s almost impossible for you to really understand what’s going on. If you program in Java, or C#, or C++, or even C, there is magic. But after you have written some machine language, the magic goes away. You realize that you could write a C compiler in machine language. You realize that you could write a JVM, a C++ compiler, a Ruby interpreter. It would take a bit of time and effort. But you could do it. The magic is gone.

I don’t know exactly what prompted me recently to start learning Assembler, but these comments from Uncle Bob resonated with me. If you don’t know how a computer works, how do you expect to understand what is going on when you develop code to run on it?

So there you go. Bob said it. Go learn Assembler. Maybe you’ll learn something.

 
