From: pottier@clipper.ens.fr (Francois Pottier)
Subject: C.S.M.P. Digest, Issue 3.006
Date: Fri, 18 Mar 94 16:15:51 MET

C.S.M.P. Digest             Fri, 18 Mar 94       Volume 3 : Issue 6

Today's Topics:

        68K emulation and PPC toolbox questions.
        AE coercion handlers ?
        ARGH!!! Whamwhamwham!! (DialogSelect stuff)
        An offering: Assembly language code for a high speed copybits
        Animation speed: here we go again...
        Animation speed: improvement!
        Animation speed: more info
        Animation: the story continues...
        Blank Screen?
        C code for scrollable application help
        Can C be as fast as Assembler? (next...on the McLaughlin Group)
        Color Terminal Emulator
        DrawString + Ellipsis character ?
        Finder comments on non-Desktop DB volumes?
        Finding System Folder (Again)
        Free code: Sean's window manager
        How does the Finder handle events?
        How to determine color of progress bar?
        How to draw inits icons?
        How to tell a Mac from a Mac?
        I know the Mac MM isn't reentrant, but:
        If you use SysBeep() for debugging...
        Improving DrawText speed (Was: Color Terminal Emulator)
        Interface guidelines for extra program files
        Intermixing graphics and text
        Let's kill 24-bit mode! (was Re: Let's kill System 6!)
        Let's kill system 6!
        Never beep when using GWorlds. System software bug!
        PPC & 68k UPP problems
        PPC binaries
        Passing data through to completion procs?
        Password editing item.. Tricky?
        Permanent front windows...
        Preference file question!
        Reading PICT files != 72 dpi. How ?
        Resources on PowerPC
        Safer Segments ?
        Speeding up animation; questions
        System Folder on NONstartup disk
        Trap dispatcher overhead
        User in a menu?
        What happens if my Vertical Retrace task takes too long?
        When to StripAddress? (was Re: Let's kill 24-bit mode!)
        Why can't I have AEs *in* AEs?
        Why use handles at all, though?
        Writing To Screen Memory
        dirIDs
        jGNEFilter Q
        password encryption

The Comp.Sys.Mac.Programmer Digest is moderated by Francois Pottier
(pottier@clipper.ens.fr).
The digest is a collection of article threads from the internet newsgroup
comp.sys.mac.programmer. It is designed for people who read c.s.m.p.
semi-regularly and want an archive of the discussions. If you don't know what
a newsgroup is, you probably don't have access to it. Ask your systems
administrator(s) for details. If you don't have access to news, you may still
be able to post messages to the group by using a mail server like
anon.penet.fi (mail help@anon.penet.fi for more information).

Each issue of the digest contains one or more sets of articles (called
threads), with each set corresponding to a 'discussion' of a particular
subject. The articles are not edited; all articles included in this digest
are in their original posted form (as received by our news server at
nef.ens.fr). Article threads are not added to the digest until the last
article added to the thread is at least two weeks old (this is to ensure that
the thread is dead before adding it to the digest). Article threads that
consist of only one message are generally not included in the digest.

The digest is officially distributed by two means, by email and ftp.

If you want to receive the digest by mail, send email to listserv@ens.fr
with no subject and one of the following commands as body:

    help                                Sends you a summary of commands
    subscribe csmp-digest Your Name     Adds you to the mailing list
    signoff csmp-digest                 Removes you from the list

Once you have subscribed, you will automatically receive each new
issue as it is created.

The official ftp info is //ftp.dartmouth.edu/pub/csmp-digest.
Questions related to the ftp site should be directed to
scott.silver@dartmouth.edu. Currently no previous volumes of the CSMP
digest are available there.

Also, the digests are available to WAIS users as comp.sys.mac.programmer.src.

-------------------------------------------------------

>From herman@ece.cmu.edu (Herman Schmit)
Subject: 68K emulation and PPC toolbox questions.
Date: 16 Mar 1994 19:32:32 GMT Organization: Electrical and Computer Engineering, Carnegie Mellon When doing 68K emulation, will a Power Mac do emulation of 68K toolbox routines, or does the emulator detect those routines and execute the PowerPC toolbox code for that routine? I'm also curious exactly why there will not be a 68K->PowerPC machine code translator in addition to an emulator. I always thought that the problem with translating between CPUs was not caused by different instruction sets but by different system/OS calls and different memory models. If the toolbox will be the same, the system/OS calls are no longer a problem. Do the Power Macs have a significantly different memory model? Even if they have a different memory model, couldn't you do some sort of emulation of the 68K memory and translate the everything else into PPC native code? Or is this what is done? herman +++++++++++++++++++++++++++ >From rang@winternet.mpls.mn.us (Anton Rang) Date: 17 Mar 1994 05:13:49 GMT Organization: Minnesota Angsters In article <2m7msg$s9f@fs7.ece.cmu.edu> herman@ece.cmu.edu (Herman Schmit) writes: >When doing 68K emulation, will a Power Mac do emulation of 68K toolbox >routines, or does the emulator detect those routines and execute the >PowerPC toolbox code for that routine? The emulator detects A-traps and dispatches them through the trap table, as happens on 68K machines. If the trap is written in PowerPC code, the Mixed Mode Manager switches to PowerPC native mode (if needed) and then runs it. Similarly, if the trap is in 68K code, the MMM switches to 68K emulation and then runs it. >I'm also curious exactly why there will not be a 68K->PowerPC machine >code translator in addition to an emulator. I always thought that the >problem with translating between CPUs was not caused by different >instruction sets but by different system/OS calls and different memory >models. First, there is at least one third-party translator available, FlashPort by Echo Logic. 
But it requires some input from the developers to do a good job; you can't just drag-and-drop onto it. Second, both problems exist. It's not trivial to convert between different code models *and get reasonable performance*. It can be done -- VEST on DEC's Alpha VMS machines is truly incredible! -- but it's very state-of-the-art and probably would have cost Apple many millions to develop.... -- Anton Rang (rang@winternet.mpls.mn.us) +++++++++++++++++++++++++++ >From zstern@adobe.com (Zalman Stern) Date: Fri, 18 Mar 1994 11:55:45 GMT Organization: Adobe Systems Incorporated Anton Rang writes > In article <2m7msg$s9f@fs7.ece.cmu.edu> herman@ece.cmu.edu (Herman Schmit) writes: > >When doing 68K emulation, will a Power Mac do emulation of 68K toolbox > >routines, or does the emulator detect those routines and execute the > >PowerPC toolbox code for that routine? > > The emulator detects A-traps and dispatches them through the trap > table, as happens on 68K machines. If the trap is written in PowerPC > code, the Mixed Mode Manager switches to PowerPC native mode (if > needed) and then runs it. Similarly, if the trap is in 68K code, the > MMM switches to 68K emulation and then runs it. In addition, some traps are "fat" and provide both 68K and PowerPC code. This avoids the overhead of a mixed-mode switch for very small routines. (Like say SetPort and GetPort.) All of the above applies to any routine descriptor, not just the ones placed in the trap table. The most common use is for callbacks passed to toolbox routines, however they can be used within your own code as well. (There is special support in Mixed Mode to handle the dispatching and calling conventions of traps of course.) -- Zalman Stern zalman@adobe.com (415) 962 3824 Adobe Systems, 1585 Charleston Rd., POB 7900, Mountain View, CA 94039-7900 "Do right, and risk consequences." 
Motto of Sam Houston (via Molly Ivins) --------------------------- >From paulr@syma.sussex.ac.uk (Paul Russell) Subject: AE coercion handlers ? Date: Wed, 2 Mar 1994 16:31:54 GMT Organization: University of Sussex Am I right in thinking that there are no default system AE coercion handlers ? Not even for such basic conversions as typeChar<->typeExtended ? Is there a utility for displaying the installed coercion handlers ? Are there any available handlers for common types, either as source or as an extension ? //Paul -- | Paul Russell | Internet: P.T.Russell@sussex.ac.uk | | Experimental Psychology | AppleLink: EP.SUSSEX | | Sussex University, Falmer | Telephone: +44 273 678639 | | Brighton BN1 9QG, England | Facsimile: +44 273 678433 | +++++++++++++++++++++++++++ >From jwbaxter@olympus.net (John W. Baxter) Date: Wed, 02 Mar 1994 12:43:30 -0800 Organization: Internet for the Olympic Peninsula In article <1994Mar2.163154.19747@syma.sussex.ac.uk>, paulr@syma.sussex.ac.uk (Paul Russell) wrote: > Am I right in thinking that there are no default > system AE coercion handlers ? Not even for such > basic conversions as typeChar<->typeExtended ? There are a bunch of coercions built in to the Apple Event Manager. As it happens, typeChar<->typeExtended is one of them. They are listed in Inside Mac: IAC in table 4-1, which occupies ALL of pages 4-43 and 4-44. There are more "exotic" ones built in, too, such as typeAppleEvent --> typeAppParameters (by far the easiest way to build THAT monster). > Is there a utility for displaying the installed > coercion handlers ? Yes...there is a little FKey. It puts up a nice list of all the Application handlers (event, coercion, object extraction, etc) for the front application, and all the System handlers. It is broken when the current Finder is in front [if it hurts, don't do it]. It's on the Developer CDs...since it is there, I don't know where else it may be. 
It can also be used to trigger the handlers, although I haven't exercised that ability. > Are there any available handlers for common > types, either as source or as an extension ? AppleScript installs a whole bunch more. And they are multiplying on the net. They can get pretty much as outrageous as you may want. I suppose a hypothetical typeRTF -> hypothetical typePostScript would be possible (but not written by me). -- John Baxter Port Ludlow, WA, USA [West shore, Puget Sound] jwbaxter@pt.olympus.net +++++++++++++++++++++++++++ >From paulr@syma.sussex.ac.uk (Paul Russell) Date: Thu, 3 Mar 1994 13:49:08 GMT Organization: University of Sussex John W. Baxter (jwbaxter@olympus.net) wrote: : In article <1994Mar2.163154.19747@syma.sussex.ac.uk>, : paulr@syma.sussex.ac.uk (Paul Russell) wrote: : > Am I right in thinking that there are no default : > system AE coercion handlers ? Not even for such : > basic conversions as typeChar<->typeExtended ? : There are a bunch of coercions built in to the Apple Event Manager. As it : happens, typeChar<->typeExtended is one of them. They are listed in Inside : Mac: IAC in table 4-1, which occupies ALL of pages 4-43 and 4-44. : There are more "exotic" ones built in, too, such as typeAppleEvent --> : typeAppParameters (by far the easiest way to build THAT monster). Thanks for the above and the rest of your comments - it looks like I have some sort of problem. I tried writing a small program which just calls AEGetCoercionHandler for a few different types and I get a -1717 for everything I've tried. I have Apple Event Manager 1.0.1 and AppleScript 1.0 installed and am running System 7.1. I think both Apple Event Manager 1.0.1 and AppleScript 1.0 may be out of date by now so I'll have a dig through the developer CD's and see if I can find something newer. 
//Paul -- | Paul Russell | Internet: P.T.Russell@sussex.ac.uk | | Experimental Psychology | AppleLink: EP.SUSSEX | | Sussex University, Falmer | Telephone: +44 273 678639 | | Brighton BN1 9QG, England | Facsimile: +44 273 678433 | +++++++++++++++++++++++++++ >From isis@netcom.com (Mike Cohen) Date: Thu, 3 Mar 1994 01:48:50 GMT Organization: ISIS International paulr@syma.sussex.ac.uk (Paul Russell) writes: >Am I right in thinking that there are no default >system AE coercion handlers ? Not even for such >basic conversions as typeChar<->typeExtended ? >Is there a utility for displaying the installed >coercion handlers ? >Are there any available handlers for common >types, either as source or as an extension ? >//Paul There are default handlers for most numeric types to/from text and from alias to FSSpec. With AppleScript installed, there are many more coercion handlers available. -- Mike Cohen - isis@netcom.com NewtonMail: MikeC49506 / ALink: D6734 / AOL: MikeC20 +++++++++++++++++++++++++++ >From paulr@syma.sussex.ac.uk (Paul Russell) Date: Fri, 4 Mar 1994 13:20:57 GMT Organization: University of Sussex Mike Cohen (isis@netcom.com) wrote: : paulr@syma.sussex.ac.uk (Paul Russell) writes: : >Am I right in thinking that there are no default : >system AE coercion handlers ? Not even for such : >basic conversions as typeChar<->typeExtended ? : >Is there a utility for displaying the installed : >coercion handlers ? : >Are there any available handlers for common : >types, either as source or as an extension ? : >//Paul : There are default handlers for most numeric types to/from text and from : alias to FSSpec. With AppleScript installed, there are many more coercion : handlers available. Thanks - I found the FKEY that displays the installed handlers and although there is a motley assortment of coercion handlers, none of the expected handlers for text<->float etc seem to be available. 
This appears to be the case on several Macs that I have tried this on, and I have also written a test program which calls AEGetCoercionHandler which returns -1717 for just about any pair of types I try. My best guess is that this is something to do with localisation - I am wondering if the US version of the Apple Event Manager/Apple Script checks the system version and doesn't install any handlers which might be country-specific ? If the above is not the correct explanation then I'd be interested to hear any other possible explanations for why the usual handlers aren't available ? //Paul -- | Paul Russell | Internet: P.T.Russell@sussex.ac.uk | | Experimental Psychology | AppleLink: EP.SUSSEX | | Sussex University, Falmer | Telephone: +44 273 678639 | | Brighton BN1 9QG, England | Facsimile: +44 273 678433 | +++++++++++++++++++++++++++ >From lai@apple.com (Ed Lai) Date: 4 Mar 1994 23:32:40 GMT Organization: Apple In article <1994Mar4.132057.18343@syma.sussex.ac.uk>, paulr@syma.sussex.ac.uk (Paul Russell) wrote: > Mike Cohen (isis@netcom.com) wrote: > : paulr@syma.sussex.ac.uk (Paul Russell) writes: > > : >Am I right in thinking that there are no default > : >system AE coercion handlers ? Not even for such > : >basic conversions as typeChar<->typeExtended ? > > : >Is there a utility for displaying the installed > : >coercion handlers ? > > : >Are there any available handlers for common > : >types, either as source or as an extension ? > > : >//Paul > > : There are default handlers for most numeric types to/from text and from > : alias to FSSpec. With AppleScript installed, there are many more coercion > : handlers available. > > Thanks - I found the FKEY that displays the installed handlers and > although there is a motley assortment of coercion handlers, none > of the expected handlers for text<->float etc seem to be available. 
> This appears to be the case on several Macs that I have tried this on, > and I have also written a test program which calls AEGetCoercionHandler > which returns -1717 for just about any pair of types I try. > > My best guess is that this is something to do with localisation - I am > wondering if the US version of the Apple Event Manager/Apple Script > checks the system version and doesn't install any handlers which might > be country-specific ? > > If the above is not the correct explanation then I'd be interested to > hear any other possible explanations for why the usual handlers aren't > available ? > > //Paul > > -- > | Paul Russell | Internet: P.T.Russell@sussex.ac.uk | > | Experimental Psychology | AppleLink: EP.SUSSEX | > | Sussex University, Falmer | Telephone: +44 273 678639 | > | Brighton BN1 9QG, England | Facsimile: +44 273 678433 | The FKEY displays the installed handlers, i.e. those handler installed using AEInstallXXX, but not the built-in handlers. The built-in handlers are listed in IM. The are built-in typeChar<->numericTypes. However they are not localized and are not meant for formating of numbers, they are more like what you expect to see in a debugger. -- /* Disclaimer: All statments and opinions expressed are my own */ /* Edmund K. Lai */ /* Apple Computer, MS303-3A */ /* 20525 Mariani Ave, */ /* Cupertino, CA 95014 */ /* (408)974-6272 */ zW@h9cOi +++++++++++++++++++++++++++ >From ldo@waikato.ac.nz (Lawrence D'Oliveiro, Waikato University) Date: 9 Mar 94 11:59:29 +1300 Organization: University of Waikato, Hamilton, New Zealand In article , isis@netcom.com (Mike Cohen) writes: > paulr@syma.sussex.ac.uk (Paul Russell) writes: > >>Am I right in thinking that there are no default >>system AE coercion handlers ? Not even for such >>basic conversions as typeChar<->typeExtended ? > > There are default handlers for most numeric types to/from text and from > alias to FSSpec. 
With AppleScript installed, there are many more coercion > handlers available. But one obvious one is missing: converting a pathname string to an alias or an FSSpec. I _always_ keep forgetting to prefix my pathnames with "file" or "alias"... Lawrence D'Oliveiro fone: +64-7-856-2889 Info & Tech Services Division fax: +64-7-838-4066 University of Waikato electric mail: ldo@waikato.ac.nz Hamilton, New Zealand 37^ 47' 26" S, 175^ 19' 7" E, GMT+13:00 +++++++++++++++++++++++++++ >From lai@apple.com (Ed Lai) Date: 10 Mar 1994 17:57:10 GMT Organization: Apple In article <1994Mar3.134908.28390@syma.sussex.ac.uk>, paulr@syma.sussex.ac.uk (Paul Russell) wrote: > John W. Baxter (jwbaxter@olympus.net) wrote: > : In article <1994Mar2.163154.19747@syma.sussex.ac.uk>, > : paulr@syma.sussex.ac.uk (Paul Russell) wrote: > > : > Am I right in thinking that there are no default > : > system AE coercion handlers ? Not even for such > : > basic conversions as typeChar<->typeExtended ? > > : There are a bunch of coercions built in to the Apple Event Manager. As it > : happens, typeChar<->typeExtended is one of them. They are listed in Inside > : Mac: IAC in table 4-1, which occupies ALL of pages 4-43 and 4-44. > > : There are more "exotic" ones built in, too, such as typeAppleEvent --> > : typeAppParameters (by far the easiest way to build THAT monster). > > Thanks for the above and the rest of your comments - it looks like I > have some sort of problem. I tried writing a small program which just > calls AEGetCoercionHandler for a few different types and I get a -1717 > for everything I've tried. I have Apple Event Manager 1.0.1 and > AppleScript 1.0 installed and am running System 7.1. I think both > Apple Event Manager 1.0.1 and AppleScript 1.0 may be out of date > by now so I'll have a dig through the developer CD's and see if > I can find something newer. 
> > //Paul > -- > | Paul Russell | Internet: P.T.Russell@sussex.ac.uk | > | Experimental Psychology | AppleLink: EP.SUSSEX | > | Sussex University, Falmer | Telephone: +44 273 678639 | > | Brighton BN1 9QG, England | Facsimile: +44 273 678433 | AEGetCoercionHandler just returns whether an XXXX->YYYY coercion handler has been installed, it does not really tell you if a particular coercion exists. Built-in handler does not have a fixed address since the PACK can be relocated so the address cannot be returned. And if XXXX->YYYY coercion is handled by the ****->YYYY coercion handler, AEM has no idea that XXXX->YYYY can or cannot be done through ****->YYYY. So the rule is that it strictly returns an XXXX->YYYY address if it exists, it is not meant to rule out the existence of the possibility of XXXX->YYYY coercion. -- /* Disclaimer: All statments and opinions expressed are my own */ /* Edmund K. Lai */ /* Apple Computer, MS303-3A */ /* 20525 Mariani Ave, */ /* Cupertino, CA 95014 */ /* (408)974-6272 */ zW@h9cOi +++++++++++++++++++++++++++ >From jonpugh@netcom.com (Jon Pugh) Date: Sat, 12 Mar 1994 07:19:44 GMT Organization: NETCOM On-line Communication Services (408 241-9760 guest) Lawrence D'Oliveiro, Waikato University (ldo@waikato.ac.nz) wrote: > But one obvious one is missing: converting a pathname string to an alias or > an FSSpec. I _always_ keep forgetting to prefix my pathnames with "file" or > "alias"... This is done and working as part of Jon's Commands 1.1. Interested in beta testing? Jon --------------------------- >From gjw2824@hertz.njit.edu (Greg Weston) Subject: ARGH!!! Whamwhamwham!! (DialogSelect stuff) Date: 5 Mar 94 19:58:17 GMT Organization: New Jersey Institute of Technology, Newark, New Jersey Howdy, folks. I've been playing with everyone's favorite UI addition: Floating Windows. I've gotten them to work smoothly and cleanly, and they interact fine with normal windows and dialogs (modal or not). 
The only problem is within a pair of cute little routines called IsDialogEvent and DialogSelect. They don't like having a (floating) window in front of the Dialog they're working with. So, I re-wrote them. (Cough. Embarrased grin.) Still no problem. I'm left with one teensy little problem: How in the blazes does DialogSelect do its magic when you have multiple editTexts in a dialog and either click in a non-current one or press Tab?!?! On that topic, IM is silent, and I can't figure out how to successfully pull off the swap with what they give you. Any thoughts, suggestions, or even polite chuckles would be appreciated. Thanks, Greg +++++++++++++++++++++++++++ >From cconstan@epdiv1.env.gov.bc.ca (Carl B. Constantine) Date: Mon, 07 Mar 1994 10:31:52 -0800 Organization: Ministry of Environment, Lands & Parks In article <1994Mar5.195817.28298@njitgw.njit.edu>, gjw2824@hertz.njit.edu (Greg Weston) wrote: > Howdy, folks. I've been playing with everyone's favorite UI addition: > Floating Windows. I've gotten them to work smoothly and cleanly, and > they interact fine with normal windows and dialogs (modal or not). The > only problem is within a pair of cute little routines called > IsDialogEvent and DialogSelect. They don't like having a (floating) > window in front of the Dialog they're working with. > > So, I re-wrote them. (Cough. Embarrased grin.) Still no problem. I'm > left with one teensy little problem: > How in the blazes does DialogSelect do its magic when you have > multiple editTexts in a dialog and either click in a non-current one > or press Tab?!?! > > On that topic, IM is silent, and I can't figure out how to > successfully pull off the swap with what they give you. Any thoughts, > suggestions, or even polite chuckles would be appreciated. > Thanks, > Greg One solution the new IM suggests instead of using dialog Select, you can use a custom routing to take a look at what kind of window the event occured in and then process the event that way. 
Very useful if you're dealing with modeless and movableModal Dialogs.

Source: IM Macintosh Toolbox Essentials, ch. 6 Dialog Manager.

-- 
=========================================================================
Carl B. Constantine                B.C. Environment, Lands & Parks
End-User Support Analyst           CCONSTAN@epdiv1.env.gov.bc.ca

+++++++++++++++++++++++++++

>From Steve Bryan
Date: Tue, 8 Mar 1994 15:37:06 GMT
Organization: Sexton Software

In article <1994Mar5.195817.28298@njitgw.njit.edu> Greg Weston,
gjw2824@hertz.njit.edu writes:
>How in the blazes does DialogSelect do its magic when you have
>multiple editTexts in a dialog and either click in a non-current one
>or press Tab?!?!

I can't tell exactly how far along you are in this project but if you haven't
taken a look at the DialogRecord structure you should do so now (I know, you
probably have).

DialogRecord = record
    window:     WindowRecord;
    items:      Handle;
    textH:      TEHandle;
    editField:  Integer;
    editOpen:   Integer;
    aDefItem:   Integer;
end;

You need to manipulate the textH and editField variables. EditField points to
the current text item in the linked list starting at items. Of course you
have to update the current text item before setting up textH for the new text
item. I thought there was some fairly useful information about this stuff in
Inside Mac Volume I. My volume I is at home so I can't check but try looking
there.

+++++++++++++++++++++++++++

>From u9119523@sys.uea.ac.uk (Graham Cox)
Date: Wed, 9 Mar 1994 17:01:47 GMT
Organization: School of Information Systems, UEA, Norwich

In article , cconstan@epdiv1.env.gov.bc.ca (Carl B. Constantine) wrote:

> In article <1994Mar5.195817.28298@njitgw.njit.edu>, gjw2824@hertz.njit.edu
> (Greg Weston) wrote:
>
> > Howdy, folks. I've been playing with everyone's favorite UI addition:
> > Floating Windows. I've gotten them to work smoothly and cleanly, and
> > they interact fine with normal windows and dialogs (modal or not).
The > > only problem is within a pair of cute little routines called > > IsDialogEvent and DialogSelect. They don't like having a (floating) > > window in front of the Dialog they're working with. > > > > So, I re-wrote them. (Cough. Embarrased grin.) Still no problem. I'm > > left with one teensy little problem: > > How in the blazes does DialogSelect do its magic when you have > > multiple editTexts in a dialog and either click in a non-current one > > or press Tab?!?! > > [SNIP!] If your question is how does it switch from one field to another, then I can answer! In the DIALOGRECORD is a field which contains the ID number of the current editable item. When you hit tab, this is incremented and GetDItem called to see if the resulting item is an edit field, if not, it increments until it finds one or until the last item is found, at which point it starts over from item 1. When it finds an edit field item, it retrieves the text from the current one and stashes it into the item list as that item's string, then installs the text from the new one into the teRecord using TESetText, then selects it with TESetSelect, or if it was a click, calls TEClick. The DialogRecord also contains the teHandle. The DialogRecord itself is documented in IM, though this sequence of events isn't- I had to figure this out for myself once when trying to do something along the same lines as you. You can get the dialog record by casting the DialogPtr to type DialogPeek. Hope this helps! > ========================================================================= > Carl B. Constantine B.C. Environment, Lands & Parks > End-User Support Analyst CCONSTAN@epdiv1.env.gov.bc.ca - ------------------------------------------------------------------------ Love & BSWK, Graham -Everyone is entitled to their opinion, no matter how wrong they may be... 
- ------------------------------------------------------------------------ - ------------------------------------------------------------------------ Love & BSWK, Graham -Everyone is entitled to their opinion, no matter how wrong they may be... - ------------------------------------------------------------------------ +++++++++++++++++++++++++++ >From qsi@NU91.wlink.nl (Peter Kocourek) Date: Tue, 08 Mar 1994 20:55:13 +0100 Organization: (none) Greg Weston wrote in a message on 05 Mar 94 to All GW> Howdy, folks. I've been playing with everyone's favorite UI GW> addition: Floating Windows. I've gotten them to work smoothly GW> and cleanly, and they interact fine with normal windows and GW> dialogs (modal or not). The only problem is within a pair of GW> cute little routines called IsDialogEvent and DialogSelect. GW> They don't like having a (floating) window in front of the GW> Dialog they're working with. GW> So, I re-wrote them. (Cough. Embarrased grin.) Still no problem. I'm GW> left with one teensy little problem: How in the blazes does GW> DialogSelect do its magic when you have multiple editTexts in a GW> dialog and either click in a non-current one or press Tab?!?! These two issues are separate from one another. Handling Tab presses is far easier than getting the mouseclick-induced editText change right. In fact, I haven't been able to get my own routines to do this properly. I was writing my own replacements for DialogSelect (to handle movable modals and modeless dialogs, with enhancements) and bumped into this problem. I tried all sorts of things with the TERecord, but I wasn't able to get the swap done cleanly. So I cheated. :-) In my generic mouseDown-handling procedure (within my DialogSelect replacement), I first check to see where the mouseDown occurred. If it is in an editText field, I check whether it's the currently active text-input-capable [TIC, my shorthand] item, and in that case a simple TEClick call will take care of everything. 
If it is not the active TIC item (you have to keep track of this separately), and the currently active TIC is not an editText item, I call the deactivate function for the TIC; this is usually a custom routine for userItems which contain lists, for instance. One aesthetic problem with this is, that userItems don't have refCons, so that storing the ProcPtr's for userItem service routines is a bit cumbersome. Anyway, once your previously active TIC item is deactivated, you can activate manually the TERecord, and call TEClick. If, however, the active TIC item is an editText item, you'll have to "cheat". Having found no way doing the transition from one editText item to another cleanly, I simply call the real DialogSelect at that point. Note that I only do this when I have determined unambiguously, that the action to be taken is switching from one editText item to anohter. As for handling Tabs, the situation is a bit easier. You have to find the next TIC item in your dialog (or the previous one, if the user pressed shift-Tab), and handle the deactivating and activating as above. The one thing that makes it doable without "cheating", is that you don't have to place the caret anywhere within the text of the next editText item (this was causing my problems), but you either select the entire text, if there is any, or you place just a caret, if there isn't. I'll include my source code where I do this. Parts of the code are specific to my implementation. For instance, I store a struct via a Handle in the window refCon. This struct contains lots of information about the dialog; you'll have to adapt it to your own needs. You will have to provide the CanAcceptText and ActivateItem functions for your own userItems. (sorry about the formatting) /************************************************************************************* * Somehow I'm not sure I should be doing this... requires too much thinking. 
* CycleKeyBoardInput is a generic routine that will find and activate the next (or * previous) item in the DITL that can accept keyDown events. To do this, it needs: * + a DialogPtr, to indicate in which dialog to look, and to get at the WIHandle. * + a Boolean isShiftPressed, to determine whether to search forward or backward in * the DITL * + a pointer to a function, that returns a Boolean. If this function, CanAcceptText, * returns TRUE, then the userItem (passed as a short to CanAcceptText) can accept * keyDown events. This function should be declared along with the other specific * functions for this dialog (as in AddressesDialog.c) * + another pointer to a function, that will either activate or deactivate a userItem * in the DITL, that can accept text. {De}Activating editText items is done here. * ************************************************************************************/ void CycleKeyboardInput(DialogPtr dPtr, Boolean isShiftPressed, Boolean (*CanAcceptText)(short), void (*ActivateItem)(WindowInfoHandle, short, Boolean)) { WindowInfoHandle aWIHandle; /* my own struct with info */ short numItems, activeItem, queryItem, iType; aWIHandle = (WindowInfoHandle)GetWRefCon(dPtr); activeItem = (**aWIHandle).activeItem; /* keeping track of the active TIC */ queryItem = activeItem + (isShiftPressed ? -1 : 1); numItems = CountDITL(dPtr); while (queryItem != activeItem) /* check to see if we're back where we started */ { if (queryItem == 0) /* handle rollover */ queryItem = numItems; else if (queryItem == numItems+1) queryItem = 1; GetDItem(dPtr, queryItem, &iType, &workHandle, &workRect); /* get item info */ if (iType == editText) /* if editText, we're finished */ break; if (iType == userItem && CanAcceptText(queryItem)) /* same for userItem */ break; isShiftPressed ? queryItem-- : queryItem++; /* get next item to query */ } if (queryItem != activeItem) /* found a new one? 
    */
    {
        short aType;

        GetDItem(dPtr, activeItem, &aType, &workHandle, &workRect);
        if (aType == userItem)
            ActivateItem(aWIHandle, activeItem, FALSE);    /* deactivate currently active user item */
        if (iType == editText)
        {
            SelIText(dPtr, queryItem, 0, 32767);    /* select new editText item */
            workTEHandle = ((DialogPeek)dPtr)->textH;
        }
        else if (iType == userItem)
        {
            if (aType == editText)    /* was previously active item an editText? */
                TEDeactivate(((DialogPeek)dPtr)->textH);
            ActivateItem(aWIHandle, queryItem, TRUE);    /* select new userItem */
            workTEHandle = NIL;
        }
        (**aWIHandle).activeItem = queryItem;
        UpdateEditMenus(workTEHandle, kSystemTE);    /* my own service proc */
    }
}

I hope you can make some sense out of all this. :-)

Additional notes: CountDITL is System 7 specific (may come with CTB under System 6), but I don't do System 6 anymore. An example of an ActivateItem function would be for a list to draw a border around the list, to alert the user that keypresses will go to the list (to select a cell in the list).

GW> On that topic, IM is silent, and I can't figure out how to
GW> successfully pull off the swap with what they give you. Any
GW> thoughts, suggestions, or even polite chuckles would be appreciated.

:-)

YHS:QSI!

+++++++++++++++++++++++++++

>From gjw2824@hertz.njit.edu (Greg Weston)
Date: 10 Mar 94 20:38:09 GMT
Organization: New Jersey Institute of Technology, Newark, New Jersey

Well, I had three people respond, each with different suggestions, and got it working with the first one. I'd like to thank Steve Bryan, Carl Constantine, and Simon Ward for their advice. I had looked through the IM vol 1 stuff pretty carefully, but the mechanics of the manipulation really were sketchy. My NIM is 100+ miles away, so I didn't have that to look through. I have a solution that works quite well, though, and I'm very happy with the finished product.

Thank you kindly, and y'all will probably see a submission to the standard archives soon.
Greg

---------------------------

>From ejohnson@netcom.com (Eric Johnson)
Subject: An offering: Assembly language code for a high speed copybits
Date: Fri, 18 Mar 1994 07:50:48 GMT
Organization: NETCOM On-line Communication Services (408 241-9760 guest)

About two weeks ago, I had mentioned that I had written some assembly language code that could beat copybits given certain assumptions. A number of people (Alex Metcalf and David Wareing) had asked me to send them some samples. I was going to write up some demo code with it, but never got around to it due to work obligations. A few days ago, I sent David an email message with some code and a lengthy explanation. I figured that maybe everyone else could benefit from the email too.

Keep in mind that the following code was my first real whack at some high speed copybits. I may be breaking a few rules, or doing bad Mac programmer things. So USE IT AT YOUR OWN RISK. I believe further enhancements can be made to it, especially given the widespread prominence of the 030 and 040. When I first wrote it, the 020 was in much more common use.

This code has evolved into a copybits that will do some very fast masking. I had used it for a graphics engine that would display things in perspective, where each icon was a perspective block, if you will. Thus masking was needed. It looks quite sharp. I can go over that code too, as well as the trick I put in for some fast masking. As always, you pay for the speed. It needs a bit of memory.

Let me know if this helps anyone. Here is the message I sent to David Wareing.

<------------->

David,

Okay, here's some code with some explanation off the top of my head. This code was developed for a tile based adventure game that was never quite finished. The graphics were a simple display consisting of 11x11 icons where each icon was a 24x24 cicn resource. This type of display is identical to that of the old Ultima games on the Apple ][ and old PC's.
And the more recent Zelda of Nintendo or Civilization on the Macintosh.

I initially wrote everything in C, and used PlotCIcon to place my color icons on the screen. I found that to be too slow, because the Mac was constantly converting the colors in the resource to those available on the screen. Hence the molasses-like quality. So I said to myself, "Screw it, I'm going to do it in assembly."

Even though the code I'm including has been rewritten for a slightly more complex display, I'm giving this to you so you can follow the history of its development. It should make more sense this way, and you'll see some room for improvements along the way. I wrote this as a beginner Mac programmer, so I may be breaking some rules.

Let's first identify what slows down drawing to the screen. In my case, I had made a few mistakes. The first one was using PlotCIcon. The Mac was constantly converting the resource into the current colors available in the port. That's all well and good, in fact that's quite nice of them. But it slows things down a bit. To get around this, I created an off screen window that would warehouse my icons. The benefit is that the off screen window would have the same color table [CLUT, right?] so any transfer between an offscreen window and the main screen would be a trivial copy. Thus, at start time, I painted each icon into the off screen to "coerce" it into the current system table.

Okay, so I did that. I would then use copybits to repaint my window. In my game, I have a two dimensional array with numbers indicating which icon goes at what location. So, I would get the id number of the display icon, locate it in my off screen window [the palette of icons] and use copybits to get it to the window. Well, that was better, but was still slower than I expected.

So, I got to thinking about how to speed things up. The bottleneck in this case is copybits. Sure, copybits is fairly fast, but I know a few things about my icons that it doesn't know.
They are all of the same width, and same height. But how do I put this knowledge to good use?

Let's step back for a second and look at the Motorola architecture. When I wrote it, I decided to take advantage of the features found in the 68020's and later. It really starts to do well in the 030. We won't consider the 68000 because there's not much you can do for that chip. The 680x0 has eight 32-bit data registers and eight 32-bit address registers. You lose some of the address registers to the system, but all in all, there's some room to play around in. The 68020 and later have an instruction cache to work with too, and the 68030 and later add a data cache. In other words, you can have a loop fill up the instruction cache, and then the CPU can run faster because all of its code is in the cache. It's this instruction caching that I take full advantage of. In later developments of this code, I try to take advantage of the data cache.

So, we need to write some assembly language code that will copy 24 bytes at a time from 24 different parts of memory. Remember, in this case, my icons are 24 bytes wide (24 pixels at 8 bits a pixel). And they are 24 pixels high. So, we've got 24 blocks of 24 pixels. Each block starts at a slightly different memory location. My code loops 24 times, once for each row of 24 bytes. Thus, the code that copies each row lands in the instruction cache. So, on each subsequent iteration of the code, we are running strictly from the cache. This gives us some good performance.

But there's one more feature too. Most modern processors have a pipeline where the different parts of the CPU execute part of an instruction then hand it off to the next step. It works identically to an assembly line where someone installs the engine and the next person installs the tires. A new car always rolls off the line every so often even though it may take a bit for a car to travel the entire line. The catch lies with branch instructions.
The CPU won't know if the branch should be taken until its evaluation is complete. But, this means that the CPU won't necessarily be putting the right instructions in the pipeline. It would be like producing a car that indicated the previous cars should be destroyed. This wrecks efficiency. To get around it, Motorola has provided an instruction, dbra, that serves as a hint. dbra decrements the data register and branches back until the count runs out (that is, until the register drops below zero). It instructs the CPU to expect the branch and fill the pipeline with the instructions that would result if a branch takes place. So, my loop gets two nice features going: it runs from the instruction cache, and it keeps the pipeline going too. Pretty sweet, eh?

My code works in two steps. The first section is the preparatory work for the loop. I put as much stuff as I can into registers, because adding and multiplying "in register" is much faster than from memory. And the instructions are shorter, which means less space is taken up in the instruction cache. This point of putting stuff "in register" may seem anal, but remember, we need to add some values at the end of each iteration of the loop, and keeping stuff fast and small is good.

Now, let's go over the parameters to the code. mySource points to the start of the off screen palette. myDestination points to the start of the off screen drawing space. xSource and ySource are the x,y coordinates of the pixel that represents the upper left hand corner of the thing you wish copied. Keep in mind that these are *PIXEL COORDINATES* not icon coordinates. It's up to you to find the start of the icon you wish copied in your off screen palette of icons. I suppose my code could do it. I just didn't bother. xDestination and yDestination are the same as xSource and ySource except for the destination array. iconSize should be your icon height minus one, because dbra runs the loop one more time than the count you give it. In my case, it's 23.
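Before the assembly itself, here is roughly what the routine has to do, written as a plain C sketch. This is not Eric's code: CopyTile8 and its two rowBytes parameters are made-up names for illustration (the assembly hard-codes the row widths), and memcpy stands in for the six move.l instructions. It only shows the address arithmetic (base + y * rowBytes + x) and the row-by-row copy, assuming 8-bit pixels and a 24 x 24 tile.

```c
#include <assert.h>
#include <string.h>

enum { kTileSize = 24 };  /* 24 x 24 pixels at one byte per pixel */

/* Copy one tile from src to dst. Coordinates are in pixels, as in the
   description above. The two rowBytes parameters are hypothetical: they
   stand in for the row widths that the assembly hard-codes. */
static void CopyTile8(const unsigned char *src, long srcRowBytes,
                      long xSrc, long ySrc,
                      unsigned char *dst, long dstRowBytes,
                      long xDst, long yDst)
{
    const unsigned char *s;
    unsigned char *d;
    int row;

    assert(srcRowBytes >= kTileSize && dstRowBytes >= kTileSize);

    /* real address = base + y * rowBytes + x */
    s = src + ySrc * srcRowBytes + xSrc;
    d = dst + yDst * dstRowBytes + xDst;

    for (row = 0; row < kTileSize; row++) {
        memcpy(d, s, kTileSize);   /* one 24-byte row */
        s += srcRowBytes;          /* step down to the next row */
        d += dstRowBytes;
    }
}
```

A call such as CopyTile8(palette, 48, 24, 24, screen, 72, 48, 24) would copy the tile whose top left pixel is at (24,24) in a 48-byte-wide palette to pixel (48,24) of a 72-byte-wide destination. A compiler's memcpy may or may not match the hand-written loop for speed; the sketch is about the arithmetic, not the timing.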
int MyCopyBits8(Ptr mySource, Ptr myDestination,
                long int xSource, long int ySource,
                long int xDestination, long int yDestination,
                long int iconSize)
{
    asm 68000
    {
    /**
    ** We need to save the registers that we are going to clobber.
    **
    ** a1 starts out pointing to the top of the palette, but it
    ** will eventually end up pointing to the start of the source
    ** icon. Same goes for a2.
    **
    ** d2 and d3 are a bit tricky to explain. They are the row
    ** byte values for the source and destination. A row byte
    ** value is the number of bytes required to jump down to the
    ** next row. It needs to be a multiple of four else some
    ** Mac internals complain. We use these values to find the
    ** real location of both icons.
    **
    **/
    movem.l a1-a2/d0-d3, -(sp);    /* save the regs */
    move.l mySource,a1;
    move.l myDestination,a2;
    move.l #0x0780,d2;             /* hard coded row bytes value */
    move.l #0x0138,d3;             /* hard coded row bytes value */

    /**
    ** The following four lines find the real address of the source
    ** icon. They do this by following a simple formula.
    ** real_source = base + y Position * row bytes for Src + x Pos
    ** This result is placed into a1 as mentioned before.
    **/
    move.l ySource,d0;
    mulu d2,d0;
    add.l d0,a1;
    add.l xSource,a1;

    /** We follow the same formula for the destination address **/
    move.l yDestination,d0;
    mulu d3,d0;
    add.l d0,a2;
    add.l xDestination,a2;

    /**
    ** The following will seem a bit weird. Why subtract #20 from the
    ** row byte values? Keep in mind that as we are copying from the
    ** source to the destination, we are changing our pointers
    ** (marching them across the icon). When we are finished copying,
    ** we need to add the rowBytes-20 to get to the first byte of
    ** the next row. Note that 20 is iconSize-3. Yeah, that should be
    ** that way in the code. Just never bothered to change it.
    **/
    sub.l #20,d2;
    sub.l #20,d3;
    move.l iconSize,d0;

    /** END OF PREPARATORY WORK **/
    /** And now you're ready to start the copy of each row of bytes **/
@1 ;
    move.l (a1)+,(a2)+;    /** Note that we are increasing our **/
    move.l (a1)+,(a2)+;    /** pointers as we go along here.   **/
    move.l (a1)+,(a2)+;    /** Also note that we copy four     **/
    move.l (a1)+,(a2)+;    /** bytes at a crack. And we do it  **/
    move.l (a1)+,(a2)+;    /** six times. For 24 bytes!        **/
    move.l (a1),(a2);
    add.l d2,a1;           /** We need to jump down to the start of **/
    add.l d3,a2;           /** the first pixel in the next row.     **/
    dbra d0,@1;            /** Branch until we are done **/
    movem.l (sp)+, a1-a2/d0-d3    /* restore the registers */
    }
    return(1);
}

--
Eric E Johnson
ejohnson@netcom.netcom.com

---------------------------

>From alex@metcalf.demon.co.uk (Alex Metcalf)
Subject: Animation speed: here we go again...
Date: Sat, 12 Mar 1994 10:14:53 GMT
Organization: Demon Internet

Now that we've exhausted the previous "animation speed" thread, it's time to start another. :-)

Having got my game code to run (what I considered to be) extremely fast on my LC475, I thought I'd give it a whirl on our IIsi. Oh no! Extremely slow animation speed. I know that the IIsi has very slow video, but what was running at 60fps on an LC475 surely wouldn't be reduced to less than 10fps on a IIsi.

One of the things I thought might be causing the problem was that I still might be having problems matching colour tables between the GWorld and the window on-screen. I'm creating a normal colour window, and making it the size of the screen (0,0,640,480). I'm not changing its palette in any way. Then I'm using NewGWorld with a pixel depth of 0, which is meant to optimise CopyBits calls with the screen. It's also meant to use the colour table info and screen depth of the deepest monitor intercepting the given rectangle. Since I've only got one monitor, this shouldn't be the problem.
However, response still seems to be unreasonably sluggish on the IIsi: whether it's in gray scale or colour, the speed is disappointing.

I believe Andrew Welch (hope I spelt your name right) reads this area regularly: for Maelstrom, what is the animation speed like on low end '030 Macs? I know you use heavily optimised assembler for your animation, but I don't think that what I'm doing (CopyBits) should cause such a dramatic difference in animation speed.

Is there any way to make CopyBits completely ignore the colour table differences?

Interesting ideas and suggestions are always appreciated.

Alex

--
Alex Metcalf, Mac programmer in C, C++, HyperTalk, assembler
Internet, AOL, BIX: alex@metcalf.demon.co.uk
AppleLink: alex@metcalf.demon.co.uk@internet#
CompuServe: INTERNET:alex@metcalf.demon.co.uk
Delphi: alex@metcalf.demon.co.uk@inet#
FirstClass: alex@metcalf.demon.co.uk,Internet
Fax (UK): (0570) 45636
Fax (US / Canada): 011 44 570 45636

+++++++++++++++++++++++++++

>From Arsenault_C@msm.cdx.mot.com (Chris Arsenault)
Date: Tue, 15 Mar 1994 12:29:41 -0500
Organization: Motorola Codex

In article , alex@metcalf.demon.co.uk (Alex Metcalf) wrote:

> However, response still seems to be unreasonably sluggish on the
> IIsi: whether it's in gray scale or colour, the speed is disappointing.

It sounds like you're okay with your color tables and CopyBits. You might be running into a hardware problem. There is no separate VRAM for video on the IIsi (or IIci). The IIsi has 1 MB on the motherboard, a portion of which it uses as video RAM. Not that I've investigated this, but I remember reading that if the disk cache is boosted to occupy the majority of motherboard RAM, then the video driver uses the SIMMs instead and because the SIMM RAM doesn't have to deal with a bank switch wait you can get approx. a 30% speed increase.

The unfortunate part about this is that it's not really software controllable - it's sort of up to the user.
Chris

--
#include

---------------------------

>From alex@metcalf.demon.co.uk (Alex Metcalf)
Subject: Animation speed: improvement!
Date: Sun, 6 Mar 1994 23:16:59 GMT
Organization: Demon Internet

I discovered something quite interesting this afternoon, which has almost doubled the speed of the sprite animation in my game.

Just a quick recap: my game copies the background (behind the sprites) to a gworld, copies the sprites to the same gworld, and then copies that rectangle to the screen. In all, I was using 5 CopyBits for the background, 5 CopyMasks for the sprites, and 5 CopyBits for copying to the screen.

Someone originally suggested that I combine all the rectangles into a region and do a single CopyBits call. However, it turned out to be slower! Go figure. I guess the individual CopyBits calls outstrip the RectRgn and UnionRgn calls.

This afternoon, I thought I'd give it another shot, in case I'd missed something (or made a goofy error which was slowing things down). This time, for no particular reason, I chose to combine the rectangles for my CopyBits calls to the screen, and only do the single CopyBits call. I gave it a go and... WHOAH! Unbelievable speed increase (almost 200%, 60 fps). It seems that CopyBits calls have much more overhead when copying to the screen rather than copying between offscreen gworlds. I guess this is because it checks screen depth, colour tables, etc. etc.

To give you an idea of the speed increase: with the extra "time" I had in my game loop, I was able to add another 4 sprites to the screen, resulting in another 4 CopyBits calls and 4 CopyMask calls. Even then, it was still faster than when I was using individual CopyBits calls to the screen!

So, I've found that the fastest way (with normal QuickDraw routines) to make my animation work is to use individual CopyBits calls for sprites between GWorlds, and then a single CopyBits call when it's all ready to come to the screen.
Here's my code snippet for the copy-to-screen, where gWorldRect is 0,0,640,480 (full screen). I guess I don't need the second SetEmptyRgn call. (I use "t" to denote local variables).

// ------
SetEmptyRgn (tCopyRgn);
SetEmptyRgn (tRectRgn);

tObject = gFirstObject;
while (tObject != nil)
{
    if (!tObject->fVisible)
    {
        tObject = (GameObject) tObject->fNextObject;
        continue;
    }

    RectRgn (tRectRgn, &tObject->fAnimEnclosureRect);
    UnionRgn (tRectRgn, tCopyRgn, tCopyRgn);

    tObject = (GameObject) tObject->fNextObject;
}

CopyBits ((BitMap *) *tPixMap[3],
          (BitMap *) &gGameWindow->portPixMap,
          &gWorldRect, &gWorldRect, srcCopy, tCopyRgn);
// ------

On a slightly different topic: thanks to some code by Francis (Francis H Schiffer 3rd), I was able to test my game loop to see where the time was being used up. I knew that the copying of graphics takes up quite a lot of time, but I'd never imagined that it took 98% of the time! Needless to say, that is what inspired me to have another go at improving the CopyBits speed...

Thanks again to all those who have given suggestions and code snippets.... they've all been very useful (or at least, interesting!).

Alex

--
Alex Metcalf, Mac programmer in C, C++, HyperTalk, assembler
Internet, AOL, BIX: alex@metcalf.demon.co.uk
AppleLink: alex@metcalf.demon.co.uk@internet#
CompuServe: INTERNET:alex@metcalf.demon.co.uk
Delphi: alex@metcalf.demon.co.uk@inet#
FirstClass: alex@metcalf.demon.co.uk,Internet
Fax (UK): (0570) 45636
Fax (US / Canada): 011 44 570 45636

+++++++++++++++++++++++++++

>From u9119523@sys.uea.ac.uk (Graham Cox)
Date: Mon, 7 Mar 1994 11:23:35 GMT
Organization: School of Information Systems, UEA, Norwich

In article , alex@metcalf.demon.co.uk (Alex Metcalf) wrote:

>
> I discovered something quite interesting this afternoon, which has
> almost doubled the speed of the sprite animation in my game.
> [SNIP!]
I also read somewhere that CopyBits with a mask region parameter is faster than CopyMask- you might want to try this and see if it's true. - ------------------------------------------------------------------------ Love & BSWK, Graham -Everyone is entitled to their opinion, no matter how wrong they may be... - ------------------------------------------------------------------------ +++++++++++++++++++++++++++ >From Tony Myles Date: 7 Mar 1994 21:45:34 GMT Organization: The 3DO Company In article Alex Metcalf, alex@metcalf.demon.co.uk writes: [stuff deleted] > So, I've found that the fastest way (with normal QuickDraw >routines) to make my animation work is to use individual CopyBits calls for >sprites between GWorlds, and then a single CopyBits call when it's all >ready to come to the screen. [stuff deleted] Hey, thats cool. Hmm, just out of curiosity, what kind of Mac are you running this on? I think I tried this a long time ago on a Q800, and it was still slower than individual CopyBits calls to the screen. I'll have to try it again though, I can't remember if I did it quite the way you describe. ...Tony - --------------------------------------------- Tony Myles work: tony.myles@3do.com The 3DO Company home: suiryu@aol.com +++++++++++++++++++++++++++ >From alex@metcalf.demon.co.uk (Alex Metcalf) Date: Tue, 8 Mar 1994 13:38:19 GMT Organization: Demon Internet In article <2lg79u$lp8@mac_serv.3do.COM>, Tony Myles wrote: > In article Alex Metcalf, > alex@metcalf.demon.co.uk writes: > [stuff deleted] > > So, I've found that the fastest way (with normal QuickDraw > >routines) to make my animation work is to use individual CopyBits calls > for > >sprites between GWorlds, and then a single CopyBits call when it's all > >ready to come to the screen. > [stuff deleted] > > > Hey, thats cool. Hmm, just out of curiosity, what kind of Mac are you > running this on? > I'm running this on an LC475. 
I believe I've got all the colour tables matched up correctly, and the source and destination rectangles are the same (and the bit depths). Alex -- Alex Metcalf, Mac programmer in C, C++, HyperTalk, assembler Internet, AOL, BIX: alex@metcalf.demon.co.uk AppleLink: alex@metcalf.demon.co.uk@internet# CompuServe: INTERNET:alex@metcalf.demon.co.uk Delphi: alex@metcalf.demon.co.uk@inet# FirstClass: alex@metcalf.demon.co.uk,Internet Fax (UK): (0570) 45636 Fax (US / Canada): 011 44 570 45636 --------------------------- >From alex@metcalf.demon.co.uk (Alex Metcalf) Subject: Animation speed: more info Date: Thu, 3 Mar 1994 00:25:29 GMT Organization: Demon Internet Thanks to those who sent replies to me about improving the animation speed in my game. I thought I'd be a little more specific in this description about exactly what I'm doing, and (as always) I appreciate any feedback. I have an arcade game which I would like to run at 30 fps on a 68030 Mac or better. It currently DOES run at 30 fps on my LC475, but only just, and since that's on a 68LC040, I don't stand much chance of 30 fps on an 030! In my game, there are 5 or 6 sprites always on the screen, each of them 24 x 24 pixels in size. There are a number of calculations that I do with them each time through my 2 tick "loop", but I'm assuming that I can get the most speed increase by improving my animation code. I have four (yeah, four) offscreen gworlds, three of them in 8 bit and 1 in 1 bit. I'll call the 1 bit one the "mask world", and the other ones worlds "one", "two", and "three". In world one, I have all my sprite animations, placed there from a PICT resource. In the mask world, I have the masks for the sprites, all with exactly the same rectangles as the ones in world one. In world two, I have the background. I'm using world 3 as the destination world for preparing to copy to the screen window. 
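To make the roles of these four worlds a little more concrete, here is a rough model in portable C rather than QuickDraw. Everything in it is an assumption for illustration: Buffer8, CopyRect8 and CopyMasked8 are invented names, and the mask is held one byte per pixel for simplicity, whereas the real mask world is 1 bit deep. CopyRect8 plays the part of CopyBits between worlds with identical source and destination rectangles, and CopyMasked8 plays the part of CopyMask.

```c
#include <assert.h>

/* A stand-in for an offscreen world: one byte per pixel, rows packed. */
typedef struct {
    unsigned char *pixels;
    int width;
    int height;
} Buffer8;

/* Copy the same rectangle from src to dst (the CopyBits step). */
static void CopyRect8(const Buffer8 *src, Buffer8 *dst,
                      int left, int top, int w, int h)
{
    int x, y;

    assert(src->width == dst->width && src->height == dst->height);
    assert(left >= 0 && top >= 0);
    assert(left + w <= src->width && top + h <= src->height);

    for (y = top; y < top + h; y++)
        for (x = left; x < left + w; x++)
            dst->pixels[y * dst->width + x] = src->pixels[y * src->width + x];
}

/* Copy a sprite cell through its mask onto dst (the CopyMask step).
   The mask shares the sprite buffer's layout; nonzero means "draw". */
static void CopyMasked8(const Buffer8 *sprites, const Buffer8 *mask,
                        int srcLeft, int srcTop,
                        Buffer8 *dst, int dstLeft, int dstTop,
                        int w, int h)
{
    int x, y;

    assert(sprites->width == mask->width && sprites->height == mask->height);

    for (y = 0; y < h; y++) {
        for (x = 0; x < w; x++) {
            int s = (srcTop + y) * sprites->width + (srcLeft + x);
            if (mask->pixels[s] != 0)
                dst->pixels[(dstTop + y) * dst->width + (dstLeft + x)] =
                    sprites->pixels[s];
        }
    }
}
```

In these terms, one frame would restore each dirty rectangle from world two into world three with CopyRect8, then draw each sprite from world one through the mask world with CopyMasked8, and only then copy world three to the screen.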
Every time through my loop, I first copy all the background rects to world three, each one covering the previous location and the next location of a sprite. This is done with a CopyBits call between worlds. Then, I use CopyMask to copy the sprites (from world one and the mask world) on to the background (world three). Finally, I copy each of the background rects onto the screen, again using CopyBits. So in summary: I make 6 CopyBits calls between worlds, 6 CopyMask calls between worlds, and 6 CopyBits calls from the world to the screen. The rectangles being copied are no more than 32 x 32. How can I speed this up? I know that assembly programming would be useful here, but hacking up an assembler copy between gworlds is a new project to me. I would like to get the animation up to a speed where I can do 30 fps on a 20mhz 68030, with slow screen redraw (a.k.a. our Mac IIsi). Thanks in advance for any help you can give me. Alex -- Alex Metcalf, Mac programmer in C, C++, HyperTalk, assembler Internet, AOL, BIX: alex@metcalf.demon.co.uk AppleLink: alex@metcalf.demon.co.uk@internet# CompuServe: INTERNET:alex@metcalf.demon.co.uk Delphi: alex@metcalf.demon.co.uk@inet# FirstClass: alex@metcalf.demon.co.uk,Internet Fax (UK): (0570) 45636 Fax (US / Canada): 011 44 570 45636 --------------------------- >From alex@metcalf.demon.co.uk (Alex Metcalf) Subject: Animation: the story continues... Date: Sun, 6 Mar 1994 00:45:45 GMT Organization: Demon Internet Just a quick update on my original question about speeding up animation in my game code. Thanks to all those who gave suggestions for speeding things up: I'm sorry I haven't had a chance to reply to each of you individually, but I've had a huge amount of email (about 50 messages a day) and I'm finding it hard to keep up. Along side this game, I'm working on about 5 or 6 different HyperCard external projects, and that mail (together with regular newsletters and listserv discussions) makes life quite busy! 
Anyway, here are a few of the suggestions I've tried:

o A suggestion was made that rather than CopyBits each of the sprites from one world to another, I should combine them into a single region and make only one CopyBits call. Here's the way I was doing it before:

...
tObject = gFirstObject;
while (tObject != nil)
{
    if (!tObject->fVisible)
    {
        tObject = (AppObject) tObject->fNextObject;
        continue;
    }

    CopyBits ((BitMap *) *tPixMap[2],
              (BitMap *) *tPixMap[3],
              &tObject->fAnimEnclosureRect,
              &tObject->fAnimEnclosureRect,
              srcCopy, nil);

    tObject = (AppObject) tObject->fNextObject;
}
...

And here's the way I tried doing it, using only a single CopyBits call. gWorldRect is the enclosing rectangle for the gworld.

...
SetEmptyRgn (gCopyRgn);

tObject = gFirstObject;
while (tObject != nil)
{
    if (!tObject->fVisible)
    {
        tObject = (AppObject) tObject->fNextObject;
        continue;
    }

    RectRgn (gRectRgn, &tObject->fAnimEnclosureRect);
    UnionRgn (gCopyRgn, gRectRgn, gCopyRgn);

    tObject = (AppObject) tObject->fNextObject;
}

CopyBits ((BitMap *) *tPixMap[2],
          (BitMap *) *tPixMap[3],
          &gWorldRect, &gWorldRect, srcCopy, gCopyRgn);
...

The second section of code turned out to be slower than the first!

o Someone else had suggested that rather than do a CopyMask call, I could do a CopyBits call with a region (apparently being 60% faster). However, while the mask for CopyMask masks out the source, the region for CopyBits masks out the destination. Therefore, unless I can change the position of a region each time I copy a sprite to the screen, I'm not sure the region param in CopyBits will help.

Thanks again to all those who have helped out: further suggestions are always welcomed. I've learned a whole lot more about CopyBits and CopyMask now!
Thanks,

Alex

--
Alex Metcalf, Mac programmer in C, C++, HyperTalk, assembler
Internet, AOL, BIX: alex@metcalf.demon.co.uk
AppleLink: alex@metcalf.demon.co.uk@internet#
CompuServe: INTERNET:alex@metcalf.demon.co.uk
Delphi: alex@metcalf.demon.co.uk@inet#
FirstClass: alex@metcalf.demon.co.uk,Internet
Fax (UK): (0570) 45636
Fax (US / Canada): 011 44 570 45636

---------------------------

>From mprince@mail.trincoll.edu (Matthew Prince)
Subject: Blank Screen?
Date: Wed, 2 Mar 1994 16:52:07 GMT
Organization: Trinity College

I'm curious what exactly I need to do to blank the entire screen. When I try to create a window that is the entire size of the GrayRgn I am able to cover up everything but the menu bar. Is there then a hideMenuBar command or something? Also, when I PaintRect the area defined by the GrayRgn to black, a strip about the width of and right below the menu bar is left white.

Any help would be appreciated.

Matthew Prince
mprince@mail.trincoll.edu

+++++++++++++++++++++++++++

>From kenlong@netcom.com (Ken Long)
Date: Thu, 3 Mar 1994 03:53:58 GMT
Organization: NETCOM On-line Communication Services (408 241-9760 guest)

Yes. You hide your menuBar before you show your window. That way the window is not 20 pixels down with the user's desktop texture peeking over the top.

The "HideMenuBar" and "ShowMenuBar" routines in NewShuttle 1.0d3 are about as common as they come. I consulted 3 or 4 other working menu bar hide/show sources before finally deciding on that one.

That Shuttle source is kind of a cheapo. It uses a "full screen window" as long as you use a 512 x 384 monitor. Unless your window is based on screenBits.bounds, as Apple points out, such specific sized main windows are user unfriendly.

But you also have to get fancy about positioning and/or sizing your window contents if you are going to make users with monitors over 14" happy, too. But scaling your screen objects is not very viable - it usually makes them look odd.
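One way to handle the positioning side without scaling anything, sketched in plain C: keep the content at its designed size and center it in whatever screen rectangle you are handed. SimpleRect and CenterInScreen are names made up for this sketch; SimpleRect just mirrors the field order of a QuickDraw Rect so the code compiles without the Toolbox headers, and in a real program the screen rectangle would presumably come from screenBits.bounds.

```c
#include <assert.h>

/* Same field order as a QuickDraw Rect, but declared here so the sketch
   compiles without the Toolbox headers. */
typedef struct {
    short top, left, bottom, right;
} SimpleRect;

/* Return a rectangle of the given size centered inside 'screen'.
   If the content is larger than the screen in either direction, it is
   pinned to the screen's top or left edge instead of being given a
   negative offset. */
static SimpleRect CenterInScreen(SimpleRect screen, short width, short height)
{
    SimpleRect r;
    short dx = (short)(((screen.right - screen.left) - width) / 2);
    short dy = (short)(((screen.bottom - screen.top) - height) / 2);

    assert(width > 0 && height > 0);

    if (dx < 0) dx = 0;
    if (dy < 0) dy = 0;

    r.left = (short)(screen.left + dx);
    r.top = (short)(screen.top + dy);
    r.right = (short)(r.left + width);
    r.bottom = (short)(r.top + height);
    return r;
}
```

With a 640 x 480 screen and 512 x 384 content, this puts the content at left 64, top 48, which leaves an even border all round to fill with black.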
You could do a monitor check and if it's larger than 14", don't hide the MBar and center the window.

Another trick, as in "Out of This World", is to have all black outside the window. You could hide the MBar, fill the rgn with black, and center your window within it. That looks okay regardless of window size (within reason). In that particular game, a click outside the action window updates the desktop, washing away the black. A click back in the action window blacks it out again. Nicely done.

There are probably more 14" monitors than any other. The 12"s were popular when they first came out, but I wish I didn't get one. Some people, if the program is thoroughly done, will make sets of sizes of program parts for different monitors. The program would have to be worth it, and it would be done to increase the purchaser base potential.

Here's some hide/show MBar, in C:

RgnHandle mBarRgn;        // First we need to "get a holt of" the MBar.
short *mBarHeightPtr;
short oldMBarHeight;

void HideMenuBar (void)
{
    Rect mBarRect;

    GrayRgn = GetGrayRgn ();
    mBarHeightPtr = (short *) 0x0BAA;
    oldMBarHeight = *mBarHeightPtr;
    *mBarHeightPtr = 0;
    mBarRect = screenBits.bounds;
    mBarRect.bottom = mBarRect.top + oldMBarHeight;
    mBarRgn = NewRgn ();
    RectRgn (mBarRgn, &mBarRect);
    UnionRgn (GrayRgn, mBarRgn, GrayRgn);
    PaintOne (0L, mBarRgn);
}

void ShowMenuBar (void)
{
    *mBarHeightPtr = oldMBarHeight;
    DiffRgn (GrayRgn, mBarRgn, GrayRgn);
    DisposeRgn (mBarRgn);
}

And here's how they are called:

main (void)
{
    Do_Init_Managers ();     // Gee! What's this do?
    Set_Data_Array ();       // Some initializations.
    Init_Variables ();       // More init.
    HideMenuBar ();          // Take a little off the top.
    Set_Up_Window ();        // Get ready for showtime.
    HideCursor ();           // Get rid of "the fly."
    Main_Event_Loop ();      // ACTION!
    ShowCursor ();           // Action's over, bring back some control.
    ShowMenuBar ();          // Bring this back.
    DisposeWindow (&window); // Dump this, and the RAM it rode in on.
    ExitToShell ();          // Get back home, Loretta!
} // We're in the Finder. -Ken- --------------------------- >From rod@faceng.anu.edu.au Subject: C code for scrollable application help Date: 3 Mar 1994 22:19:21 GMT Organization: Department of Engineering, ANU, Australia I recall from somewhere that there exists some code example for providing a scrollable text window designed for providing help information (possibly with an indexing facility). Can someone direct me to the source (if it exists)? My Freeware application has full Balloon help but I'd like to complement it with information akin to readme files. Thanks in advance. Rod +++++++++++++++++++++++++++ >From kidwell@wam.umd.edu (Christopher Bruce Kidwell) Date: 4 Mar 1994 14:30:05 GMT Organization: University of Maryland, College Park In article <2l5npaINN75o@dubhe.anu.edu.au>, wrote: >I recall from somewhere that there exists some code example >for providing a scrollable text window designed for providing >help information (possibly with an indexing facility). Can someone >direct me to the source (if it exists)? on mac.archive.umich.edu: /development/source/help.cpt.hqx It uses a styled TEXT resource to display scrollable text with a popup menu to jump to different sections. That version shows the help in a modal dialog box -- I don't know if there's a moveable modal version out there anywhere. Chris Kidwell kidwell@wam.umd.edu +++++++++++++++++++++++++++ >From chuck@gte.com (Chuck Hoffman) Date: Fri, 4 Mar 1994 15:19:53 GMT Organization: GTE Laboratories In article <2l5npaINN75o@dubhe.anu.edu.au>, rod@faceng.anu.edu.au wrote: > I recall from somewhere that there exists some code example > for providing a scrollable text window designed for providing > help information (possibly with an indexing facility). Can someone > direct me to the source (if it exists)? > > My Freeware application has full Balloon help but I'd like to complement > it with information akin to readme files. > > Thanks in advance. 
>
> Rod

You might find the Help routines useful in the sample application
Chassis 6.0. The text is simple, non-styled text. The text and the
selection list are both scrollable. The window is not a dialog, and can
remain open while other windows are in use. The text is kept in the
resource fork. The Help menu item is on the Apple menu. In release 6.1 it
will be moved to the Help (balloon) menu. (6.1 will also be AppleEvent
aware.)

Chassis 6.0 is freeware. It is available at mac.archive.umich.edu and its
mirror sites, also at CompuServe and America OnLine.

Chassis 6.0 is also available directly from us at ftp.gte.com, file
/pub/chuck/Chassis_6.0.sea.hqx

DO NOT USE THE VERSION AT SUMEX-AIM.STANFORD.EDU. Inexplicably, they never
posted the new version. The one they have, 4.3 or so, is not 32-bit clean
and won't compile with THINK C 6.0. (Don't ask me... I sent the new
version to them twice.)

-- Chuck Hoffman
GTE Laboratories, Waltham, MA, USA 617-466-2131
- ------------------------------------------------
I'm not sure why we're here, but I am sure that
while we're here we're supposed to help each other.
- ------------------------------------------------

+++++++++++++++++++++++++++

>From Robert Hess
Date: Wed, 9 Mar 1994 02:40:56 GMT
Organization: MacWEEK

In article <2l7gld$kgq@cville-srv.wam.umd.edu> Christopher Bruce Kidwell,
kidwell@wam.umd.edu writes:
>on mac.archive.umich.edu: /development/source/help.cpt.hqx
>It uses a styled TEXT resource to display scrollable text with a popup
>menu to jump to different sections. That version shows the help in a
>modal dialog box -- I don't know if there's a moveable modal version
>out there anywhere.

You're thinking of James Walker's 'show_help', version 2.0 of which offers
a movable modal.

===========================================================================
Robert Hess, WEEKgeek       AppleLink: WNDZSX
MacWEEK                     CompuServe: 72511,333
301 Howard                  America Online: MacWEEK
San Francisco, Calif.
94105 MCI: RHESS (415) 243-3576 days Internet: (415) 243-3651 fax robert_hess@macweek.ziff.com (415) 647-5549 nights I speak for myself. And sometimes not even that. ======================================================================= ==== --------------------------- >From mfi@i-link.com (MicroFrontier Inc.) Subject: Can C be as fast as Assembler? (next...on the McLaughlin Group) Date: 28 Feb 1994 09:37:13 -0600 Organization: I-Link, Ltd., Des Moines, IA, USA - 515/255-2754 OK, I've heard both sides of the story here...some developers say that C can be as fast as assembler (or at least very, very close), provided it is written well enough. Other say that C code doesn't get anywhere near the speed of assembler, no matter how it's written. Now, I would imagine that C can get closer to assembly depending on the task that is being done....what tasks would those be? What's the best way to optimize C? And....which compiler (MPW C, Symantec C, or Metrowerks C) do you think produces the fastest C code (with all optimization turned on)? Which do you think produces the best quality code (if being the fastest doesn't make it the best quality by default)? Please post responses to the net...I'm sure this is something we can all benefit from. Also, please try to keep it civil. :-) -kevin +++++++++++++++++++++++++++ >From chyang@quip.eecs.umich.edu (Chung-Hsiung Yang) Date: 28 Feb 1994 16:11:50 GMT Organization: University of Michigan EECS Dept., Ann Arbor, MI In article <2kt339$589@ilink1.i-link.com>, mfi@i-link.com (MicroFrontier Inc.) writes: |> |> |> OK, I've heard both sides of the story here...some developers say that C |> can be as fast as assembler (or at least very, very close), provided it is |> written well enough. Other say that C code doesn't get anywhere near the |> speed of assembler, no matter how it's written. |> |> Now, I would imagine that C can get closer to assembly depending on the |> task that is being done....what tasks would those be? 
|>
|> What's the best way to optimize C?

I don't think this is really a good way to look at both sides of the
world. I tend to agree that assembler will be faster than C-generated code,
because programming in assembler requires the programmer to optimize
(somewhat) the code as you go along, since you are dealing with much lower
semantics than C. On the other hand, the level of optimization one can do
with C is really more dependent on the compiler itself.

But look at what you are doing here. What do you want to do with assembler
vs. C? If you restrict yourself to the assembler world, then you are
limited to pretty small programs with pretty limited software
architecture. Maybe you could write routines for a small but very fast
computation that does, for example, some process in digital signal
processing. Because of the overhead in C, you would probably not be able
to achieve in C the speed that you could obtain with assembler.

But imagine yourself writing 100,000 lines of code in C. Quite a big
project. Imagine writing the same code in assembly: you will probably have
to write close to a million lines or more. When you get to a million lines
of code in assembler, how do you optimize it? It is a scary thought; I
wouldn't do it. In this case I would rather depend on a well-designed C
compiler to do the job. For very big programs, I think C would very likely
produce faster code, because there is no way for human beings to write
programs that big in assembly. Also, when you get to programs that size,
there are many more tricks one can play to optimize the code than in
assembly, because a high-level software architecture such as
object-oriented design just cannot be easily achieved in assembly. (You
could do it, but it would be very hard.)

- Chung Yang

|> And....which compiler (MPW C, Symantec C, or Metrowerks C) do you think
|> produces the fastest C code (with all optimization turned on)?
Which do |> you think produces the best quality code (if being the fastest doesn't |> make it the best quality by default)? |> |> Please post responses to the net...I'm sure this is something we can all |> benefit from. Also, please try to keep it civil. :-) |> |> |> |> -kevin +++++++++++++++++++++++++++ >From mssmith@afterlife.ncsc.mil (M. Scott Smith) Date: Mon, 28 Feb 1994 16:30:42 GMT Organization: The Great Beyond In article <2kt546$23o@zip.eecs.umich.edu> chyang@quip.eecs.umich.edu (Chung-Hsiung Yang) writes: >In article <2kt339$589@ilink1.i-link.com>, mfi@i-link.com (MicroFrontier Inc.) writes: >|> >|> >|> OK, I've heard both sides of the story here...some developers say that C >|> can be as fast as assembler (or at least very, very close), provided it is >|> written well enough. Other say that C code doesn't get anywhere near the >|> speed of assembler, no matter how it's written. >|> >|> Now, I would imagine that C can get closer to assembly depending on the >|> task that is being done....what tasks would those be? >|> >|> What's the best way to optimize C? Well, first, I'd say in many cases a lot of blame is put on the compiler producing "unoptimized" code when the user could in fact be optimizing their program. Meaning, often great speed increases can be seen by changing the way your program does certain things. If you do a sort, are you using a bubble sort or a quicker sort? Things like that. Once you've done a good job in that arena, then it comes time when you can benefit from better code production. Most compilers (such as Think C) have a "dissassemble" option that allow you to look at the assembly the compiler is producing. This is helpful if you know assembly; if you don't, it may not be too useful. But if you can read assembly, you can see exactly how the compiler is interpreting your code and try making modifications to your code so that the resultant assembly is better. 
Compilers are smart, but they're not geniuses -- often switching two lines
around will signal the compiler to use some trick to make something much
quicker.

I wouldn't recommend writing in Assembly unless you absolutely need to;
that will weigh you down with concrete bricks when you want to take your
program into the future. The PowerPC is an excellent case in point. The
programmers who are porting their applications in two days are the ones
who don't have any of their code in assembly.

But a knowledge of 680x0 or PPC assembly is useful; again, you can tweak
your C code around so it results in better assembly production with your
compiler, without jeopardizing future compatibility.

>|> And....which compiler (MPW C, Symantec C, or Metrowerks C) do you think
>|> produces the fastest C code (with all optimization turned on)? Which do
>|> you think produces the best quality code (if being the fastest doesn't
>|> make it the best quality by default)?

I think this is impossible to say. Each compiler is different, and each
one might produce better code in some places and not others. Unless one
has obvious code generator flaws, they're all probably pretty good.

Code optimization (on the compiler's part) is tricky stuff, from what I
understand. Remember: the compiler basically just does the "brute force"
work of taking your C and transforming it into working, equivalent
assembly. This doesn't require much "smarts" on the compiler's part. To
look at the C code, and to find tricks for performing the same function
with fewer instructions, takes great problem-solving skills and insight
which humans often have, but which are difficult to duplicate in
computers.

Each compiler author will no doubt provide that "intelligence" in their
optimizers in different ways. Each compiler probably knows different
tricks. One routine of yours might result in lots of tricks with compiler
X, but none in compiler Y. But with your next routine the reverse might be
true.
I don't know how to define "best quality" in terms other than speed. Presumably, a compiler is going to produce _100% working code_. There's no room for errors (on the compiler's part, anyway). So you'd expect any code a compiler produces to work. The next criterion is "how quickly does it work?" This is where the optimization comes in. I can't really think of better ways to measure quality of code generation, if you make the assumption that any code coming from a compiler isn't going to have any bugs introduced by the compiler. (That may not always be a valid assumption.) Just my thoughts.. Scott - - M. Scott Smith (mssmith@afterlife.ncsc.mil) Macintosh developer.. Student.. Ski bum. Eater of Kellog's Frosted Flakes. "Last stop for fuel on the information highway" +++++++++++++++++++++++++++ >From neeri@iis.ee.ethz.ch (Matthias Neeracher) Date: 28 Feb 94 18:05:12 Organization: Integrated Systems Laboratory, ETH, Zurich In article <2kt339$589@ilink1.i-link.com>, mfi@i-link.com (MicroFrontier Inc.) writes: > OK, I've heard both sides of the story here... Really? I haven't seen a hardcore assembler advocate here in a long time. > some developers say that C > can be as fast as assembler (or at least very, very close), provided it is > written well enough. Other say that C code doesn't get anywhere near the > speed of assembler, no matter how it's written. Assembler is much slower than C in several respects: - Almost all code (with a few exceptions) takes longer to write, debug, and maintain in Assembler. Note that for the same reasons, C++ is also faster than C, Eiffel is faster than C++, and Perl for some tasks is much faster than all of them. - You will find it easier to identify and rewrite speed critical parts in a C program than in an assembler program. - A C application compiled with an "Optimizing for PowerPC" compiler will run circles around your 680X0 assembler code. 
> Now, I would imagine that C can get closer to assembly depending on the
> task that is being done....what tasks would those be?

Depends also a lot on the compiler and the target processor. The 680X0 is
a reasonable code generation target, and so is the PowerPC. In some ways,
the PowerPC will be easier, but new factors like instruction scheduling
come into play. I think assembly language programmers will have a harder
time beating compilers on PowerPCs, since instruction scheduling is rather
hard to do in one's head (except if you are the infamous Mel, who
programmed rotating disk memory machines).

> What's the best way to optimize C?

Use a good compiler. Profile. Rewrite critical sections. Repeat.

> And....which compiler (MPW C, Symantec C, or Metrowerks C) do you think
> produces the fastest C code (with all optimization turned on)?
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
One problem I have with Mac C compilers is that, at least for some of
them, turning on optimization is too dangerous to do for an entire
program.

> Which do
> you think produces the best quality code (if being the fastest doesn't
> make it the best quality by default)?

I'd take a reliable compiler over a fast compiler or one producing fast
code anytime.

> Also, please try to keep it civil. :-)

With a topic like this??

Matthias

- ---
Matthias Neeracher neeri@iis.ethz.ch
"And I won this ribbon in a Degradation Contest at the Teheran
meeting of Junkies Anonymous" -- William Burroughs, _The Naked Lunch_

+++++++++++++++++++++++++++

>From peter@ncrpda.curtin.edu.au (Peter N Lewis)
Date: 1 Mar 1994 12:03:20 +0800
Organization: NCRPDA, Curtin University

mfi@i-link.com (MicroFrontier Inc.) writes:
>OK, I've heard both sides of the story here...some developers say that C
>can be as fast as assembler (or at least very, very close), provided it is
>written well enough. Other say that C code doesn't get anywhere near the
>speed of assembler, no matter how it's written.
C or Pascal produce about the same quality code. It's generally about
half the speed of reasonably tight assembly code (although an important
thing to remember before converting your code from C/Pascal to asm is that
while you might get a double speed improvement, redesigning the
*algorithm* used may get you orders of magnitude of improvement, and also
asm can't be recompiled on the PPC to get you ~4 times speed improvement
going native instead of interpreted).

>Now, I would imagine that C can get closer to assembly depending on the
>task that is being done....what tasks would those be?

Small, simple, processor intensive things can be done more quickly in asm
(eg BlockMove, Character translation (ISO<->Mac, <->, BinHex/UU
translation, etc)). Large things are better written in a high level
language for the reasons stated above (easier to improve the algorithm,
easier to compile on faster machines) (not to mention all the other
obvious advantages (portability (kinda ;-), maintenance (kinda ;-), etc).

>What's the best way to optimize C?

Don't use any pointers. Pointers screw up optimizing compilers.

>And....which compiler (MPW C, Symantec C, or Metrowerks C) do you think
>produces the fastest C code (with all optimization turned on)? Which do
>you think produces the best quality code (if being the fastest doesn't
>make it the best quality by default)?

No idea, they are probably all within 10% (as is Pascal, normally on
the faster side).

Peter.
--
Peter N Lewis Ph: +61 9 368 2055

+++++++++++++++++++++++++++

>From resnick@cogsci.uiuc.edu (Pete Resnick)
Date: Mon, 28 Feb 1994 22:25:15 -0600
Organization: University of Illinois at Urbana-Champaign

In article <2kueq8$865@ncrpda.curtin.edu.au>, peter@ncrpda.curtin.edu.au
(Peter N Lewis) wrote:

>>What's the best way to optimize C?
>
>Don't use any pointers. Pointers screw up optimizing compilers.

What?!?!
Sometimes using pointers is the only way to get a really stupid compiler to do register allocation and register loading properly. pr -- Pete Resnick (...so what is a mojo, and why would one be rising?) Graduate assistant - Philosophy Department, Gregory Hall, UIUC System manager - Cognitive Science Group, Beckman Institute, UIUC Internet: resnick@cogsci.uiuc.edu +++++++++++++++++++++++++++ >From ari@world.std.com (Ari I Halberstadt) Date: Tue, 1 Mar 1994 07:17:57 GMT Organization: The World Public Access UNIX, Brookline, MA In article , Matthias Neeracher wrote: > - Almost all code (with a few exceptions) takes longer to write, debug, and > maintain in Assembler. Note that for the same reasons, C++ is also faster > than C, Eiffel is faster than C++, and Perl for some tasks is much faster > than all of them. If only Eiffel were more widely available (like, for the mac), and a bit cheaper for us poorer programmers. Five years ago I fell in love with Eiffel, but it's still not on the mac (except for A/UX), which makes it a bit slower than C or C++. -- Ari Halberstadt ari@world.std.com #include "These beetles were long considered to be very rare because very few entomologists look for beetles in the mountains, in winter, at night, during snow storms." -- Purves W. K., et al, "Life: The Science of +++++++++++++++++++++++++++ >From jim@brunner.wf.com (Jim Brunner) Date: 28 Feb 94 14:57:34 GMT Organization: (none) In article <2kt339$589@ilink1.i-link.com>, you write: > > OK, I've heard both sides of the story here...some developers say that C > can be as fast as assembler (or at least very, very close), provided it is > written well enough. Other say that C code doesn't get anywhere near the > speed of assembler, no matter how it's written. > > Now, I would imagine that C can get closer to assembly depending on the > task that is being done....what tasks would those be? > > What's the best way to optimize C? 
The best way to optimize ANY program in ANY language is to take a hard
look at the algorithms in use. For almost any program with noticeable
performance problems, 90% of the available speedup is in the algorithm.
It's only when you get down to those last few tweaks for fractions of a
percent that assembler *might* make any difference. (Of course, I'm
talking in the general case - like the original poster. There are, of
course, exceptions.)

Try profiling your code. Run your program with a profiler (like the one
included with Think C) and find out where it's spending its time. Most
likely, 90% of the time will be in 10% of the code. Look at that 10% and
ignore the rest. Look at the algorithms first - searching a linked list
instead of a binary tree?

Make a decision: How many people does this affect, how critical is it? If
the program in question isn't very critical, it might be more cost
effective to ignore the problem. If not too many people use the program,
it may be more cost effective to buy a faster machine. Assembly language
is expensive. It's expensive in programmer time and maintenance cost.

Go further only for those few lines of highly critical code - realize that
these changes are not necessarily beneficial across platforms. Start by
disassembling the code generated by the C compiler. Take a look at how
slight coding changes affect the generated code. Only on highly critical
sections of code (counting clock cycles here), go to assembler and hand
code the critical section (if the compiler didn't do it well enough to
begin with).

These days, assembler code is almost extinct. I've seen it used recently
on a small subroutine that was part of the firmware running on a DSP
chip - there was a timing limitation in # of clock cycles.
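
As a rough illustration of the "look at the algorithms first" advice,
here is a minimal C sketch, not taken from any poster's code. It swaps in
a sorted array with binary search for the linked-list-versus-binary-tree
case mentioned above, since that shows the same effect in fewer lines.
The names linear_find and binary_find and the steps counter are made up
for this example.

```c
#include <stddef.h>

/* Hypothetical helper: scan every element in turn.
   Takes up to n comparisons; *steps counts them. */
long linear_find(const int *a, size_t n, int key, unsigned long *steps)
{
    size_t i;

    for (i = 0; i < n; i++) {
        ++*steps;
        if (a[i] == key)
            return (long) i;
    }
    return -1;
}

/* Hypothetical helper: binary search, valid only if a[] is sorted
   in ascending order. Takes about log2(n) comparisons. */
long binary_find(const int *a, size_t n, int key, unsigned long *steps)
{
    size_t lo = 0, hi = n;      /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        ++*steps;
        if (a[mid] == key)
            return (long) mid;
        if (a[mid] < key)
            lo = mid + 1;       /* key can only be in the upper half */
        else
            hi = mid;           /* key can only be in the lower half */
    }
    return -1;
}
```

On a sorted table of 1000 entries, the linear scan may need up to 1000
comparisons to find the last entry, while the binary search never needs
more than 10. No amount of hand-tuned assembly in the linear loop is
likely to make up a gap of that size.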
- - Jim Brunner (jim@brunner.wf.com) +++++++++++++++++++++++++++ >From gregor@nrlfs1.nrl.navy.mil (joe gregor) Date: Tue, 1 Mar 1994 14:29:37 GMT Organization: NRL In Article <2kueq8$865@ncrpda.curtin.edu.au>, peter@ncrpda.curtin.edu.au (Peter N Lewis) wrote: >C or Pascal produce about teh same quality code. It's generally about >half the speed of reasonably tight assembly code... >>And....which compiler (MPW C, Symantec C, or Metrowerks C) do you think >>produces the fastest C code (with all optimization turned on)? Which do >>you think produces the best quality code (if being the fastest doesn't >>make it the best quality by default)? > >No idea, they are probably all withing 10% (as is Pascal, normally on >the faster side). I had always heard/read that C was the fastest language next to asm. I *never* heard/read that Pascal was even close, let alone faster. Please identify your references so I may (re)educate myself. -- Joe ________________________________________________________________________________ Joseph Gregor | gregor@ccf.nrl.navy.mil | THIS SPACE INTENTIONALLY LEFT BLANK. tmh@eng.umd.edu | ________________________________|_______________________________________________ +++++++++++++++++++++++++++ >From jwbaxter@olympus.net (John W. Baxter) Date: Tue, 01 Mar 1994 08:35:48 -0800 Organization: Internet for the Olympic Peninsula In article <9402281957344983@brunner.wf.com>, jim@brunner.wf.com (Jim Brunner) wrote: > Go further only for those few lines of highly critical code - realize that > these changes are not necessarily beneficial across platforms. Keeping in mind that "across platforms" above in some cases includes differences among the 68000, 68020, 68030, and 68040. You may have to decide which of those you want to target (probably 68040 these days, say I who still runs a 68030), at the cost of hurting the others. 
-- John Baxter Port Ludlow, WA, USA [West shore, Puget Sound]
jwbaxter@pt.olympus.net

+++++++++++++++++++++++++++

>From mmorgan@gpu.srv.ualberta.ca (Martin Morgan)
Date: 1 Mar 1994 17:31:22 GMT
Organization: University of Alberta

Peter N Lewis (peter@ncrpda.curtin.edu.au) wrote:
: mfi@i-link.com (MicroFrontier Inc.) writes:
: >What's the best way to optimize C?

: Don't use any pointers. Pointers screw up optimizing compilers.

Is this true? Is it true in a more restricted sense, don't use pointers in
speed-critical sections of code?

Martin Morgan
University of Alberta

+++++++++++++++++++++++++++

>From nagle@netcom.com (John Nagle)
Date: Tue, 1 Mar 1994 18:40:47 GMT
Organization: NETCOM On-line Communication Services (408 241-9760 guest)

mfi@i-link.com (MicroFrontier Inc.) writes:
>OK, I've heard both sides of the story here...some developers say that C
>can be as fast as assembler (or at least very, very close), provided it is
>written well enough. Other say that C code doesn't get anywhere near the
>speed of assembler, no matter how it's written.

Depends on the compiler. Both MPW C and Symantec C++ are worse than, say,
mainframe FORTRAN compilers of the late 1960s.

Good compilers for the 680x0 machines exist, but not on the Mac. MetaWare
High-C is available for the 68000, but they market it to embedded systems
makers, not Macs. The Sun compilers for the older 68000 Suns were
reasonably good as well, and one large CAD package for the Mac used to be
cross-compiled on a Sun to take advantage of this.

I like to try compiling

    int i;
    char tab1[100], tab2[100];
    for (i=0; i<100; i++) tab1[i] = tab2[i];

and see what the inner loop looks like. Ideally, the inner loop should
have two instructions, but I've seen as many as 12.

Incidentally, using subscripts vs pointer incrementation does not make
much difference with most modern compilers. Even SC++ gets this one right.

The big SC++ problem is really dumb register usage.
One sees lots of unnecessary register-to-register moves in SC++ output. The compiler never seems to take full advantage of all the registers available (it's a port of a compiler for Intel CPUs, which have fewer registers). In the compiler-design world, using all the registers effectively is generally considered a win even when not "optimizing", because it takes less time to figure out which register to use than to generate the register-to-register moves and stack manipulation required when doing it wrong. The global optimizer does a good job, though, except when it makes mistakes. MPW C has a better code generator but a weaker global optimizer. I tried this simple test case on a pre-release MetroWerks compiler, and it generated OK, but not spectacular code. Still, if there was a compiler for the Mac that generated state of the art optimized code, programs would be perhaps twice as fast in some cases, and somewhat smaller. John Nagle +++++++++++++++++++++++++++ >From d88-jwa@mumrik.nada.kth.se (Jon Wätte) Date: 1 Mar 1994 23:13:52 GMT Organization: Royal Institute of Technology, Stockholm, Sweden >>>What's the best way to optimize C? >>Don't use any pointers. Pointers screw up optimizing compilers. >What?!?! Sometimes using pointers is the only way to get a really stupid >compiler to do register allocation and register loading properly. Yes, but at the same time, pointers (and especially when assigned to addresses of local variables) can limit more sophisticated compilers, since they can't do a full analysis of where your pointer might point and what short-cuts it can take. -- -- Jon W{tte, h+@nada.kth.se, Mac Hacker Deluxe -- Cookie Jar: Vanilla Yoghurt with Crushed Oreos. +++++++++++++++++++++++++++ >From resnick@cogsci.uiuc.edu (Pete Resnick) Date: Tue, 01 Mar 1994 17:32:34 -0600 Organization: University of Illinois at Urbana-Champaign In article <2l0i7g$o51@news.kth.se>, d88-jwa@mumrik.nada.kth.se (Jon Wtte) wrote: >>What?!?! 
Sometimes using pointers is the only way to get a really stupid >>compiler to do register allocation and register loading properly. > >Yes, but at the same time, pointers (and especially when assigned to >addresses of local variables) can limit more sophisticated compilers, >since they can't do a full analysis of where your pointer might point >and what short-cuts it can take. Compilers are generally stupid. If you disassemble code resources you'll see that MPW C is stupider than THINK C, but THINK C is pretty stupid too. I am waiting for someone to release a compiler that does peep-hole optimization. Does anyone by any chance know if MetroWorks does so? Here's a rash claim (flames against my ignorance welcome): Stupid compilers are why RISC processors do better than CISC processors. If compilers were smart enough to take advantage of all of the interesting addressing modes and instructions on CISC architectures, CISCs would overall be faster at running programs than RISCs. (*Duck*) pr -- Pete Resnick (...so what is a mojo, and why would one be rising?) Graduate assistant - Philosophy Department, Gregory Hall, UIUC System manager - Cognitive Science Group, Beckman Institute, UIUC Internet: resnick@cogsci.uiuc.edu +++++++++++++++++++++++++++ >From platypus@cirrus.som.cwru.edu (Gary Kacmarcik) Date: 02 Mar 1994 01:25:16 GMT Organization: Case Western Reserve University, Cleveland, Ohio (USA) In article <2kvu5a$b06@quartz.ucs.ualberta.ca> mmorgan@gpu.srv.ualberta.ca (Martin Morgan) writes: Peter N Lewis (peter@ncrpda.curtin.edu.au) wrote: : mfi@i-link.com (MicroFrontier Inc.) writes: : >What's the best way to optimize C? : Don't use any pointers. Pointers screw up optimizing compilers. Is this true? Is it true in a more restricted sense, don't use pointers in speed-critical sections of code? using pointers does NOT directly result in slower code. there are many cases where pointers can be used to generate significantly optimized code. 
however, if you use pointers you should be aware of the fact that the code optimizer now can make fewer assumptions about your code, and thus it may not be able to apply certain optimizations. get a good book on compiler writing (eg: Aho, Sethi & Ullman) and read the sections on optimizing. understanding how optimizers work will greatly aid your programming: you'll have a better understanding of what the compiler can annd cannot do. -gary j kacmarcik platypus@curie.ces.cwru.edu +++++++++++++++++++++++++++ >From siegel@netcom.com (Rich Siegel) Date: Wed, 2 Mar 1994 05:41:38 GMT Organization: Bare Bones Software In article resnick@cogsci.uiuc.edu (Pete Resnick) writes: > >Compilers are generally stupid. If you disassemble code resources you'll >see that MPW C is stupider than THINK C, but THINK C is pretty stupid too. >I am waiting for someone to release a compiler that does peep-hole >optimization. Does anyone by any chance know if MetroWorks does so? Metrowerks does. So does THINK C. So does MPW C. >Here's a rash claim (flames against my ignorance welcome): Stupid >compilers are why RISC processors do better than CISC processors. If >compilers were smart enough to take advantage of all of the interesting >addressing modes and instructions on CISC architectures, CISCs would >overall be faster at running programs than RISCs. (*Duck*) I have a reality adjustment for you. On the 68020, many of the fancy addressing modes are either a wash or are actually slower than an equivalent sequence of instructions using 68000-only addressing modes. R. -- Rich Siegel % siegel@netcom.com % Principal, Bare Bones Software --> For information about BBEdit, finger bbedit@world.std.com <-- "He then proceeded to give a history of the universe, in real time." +++++++++++++++++++++++++++ >From peter@ncrpda.curtin.edu.au (Peter N Lewis) Date: 2 Mar 1994 13:20:52 +0800 Organization: NCRPDA, Curtin University >From: chewy@shell.portal.com (Paul Snively) >... >q = *p++; >... >q = *p++; >... 
> >Note that the above code features common subexpressions (in fact, for No, they are not cse's. To be eleigable for cse elemination, the expression must not have any side effects. I'd quote chapter and verse out of the Dragon book, but I don't have it handy. >The moral of the story is twofold: a) try to write clean, readable code >that doesn't rely on nasty compound functions, especially with >side-effects; and b) know thy optimizer. If you need help, disassemble >the code and see what your optimizer is doing behind your back. No, a) is certainly true, but b) should not be. The optimizer should never change the way your program behaves - if it does, either the optimizer is broken (like the THINK C optimizer), or your program is broken. >From: gregor@nrlfs1.nrl.navy.mil (joe gregor) >>No idea, they are probably all withing 10% (as is Pascal, normally on >>the faster side). > I had always heard/read that C was the fastest language next to asm. >I *never* heard/read that Pascal was even close, let alone faster. Please >identify your references so I may (re)educate myself. You've probably heard thousands of people tell you PCs are better than Macs. Try it and see for yourself. Pascal compilers will generally produce faster code than C compilers with equivalent source. Obviously, compilers vary a great deal, but the above is generally true. I wouldn't worry though, I expect the next generation of compilers will make C faster than Pascal, since Pascal compilers will get a lot less work done on them. Of course, then everyone will be using C++ and efficiency will be thrown right out the window, but such is life ;-) >From: mmorgan@gpu.srv.ualberta.ca (Martin Morgan) >: Don't use any pointers. Pointers screw up optimizing compilers. > >Is this true? Is it true in a more restricted sense, don't use pointers in >speed-critical sections of code? 
What I said is true, in that pointers screw up most really clever high
level optimizations because the compiler has an impossible task of
figuring out what you're doing. For example, if you do this:

    x = 5;
    for i:=1 to 100 do
      arr[i] = 0;
    end-for;
    y = x;

(equivalent in C or Pascal). It is easy for the compiler to know that x
has remained unchanged, that i is now undefined (and thus the register can
be reused), that the array has been modified, that i is always between 1
and 100 and so no range checking needs to be done, that y can be assigned
5 directly, etc. If you instead do this:

    x=5;
    i=100;
    p = @arr[1];
    while (i>0) do
      *p++ = 0;
      i--;
    end-while
    y = x;

Now it is nearly impossible for the compiler to determine any of that. It
has to be really clever to figure out that p only ever points to the array
arr (a compiler would probably be allowed to assume this though, since it
is undefined what happens if you increment a ptr outside of the area it
started in). If the compiler can't figure that out, then all variables
must be in memory (not registers) and all are potentially modified, so
none of the optimizations above are available.

Of course I wouldn't worry about this either, since most compilers don't
do very much in the way of clever optimizing (and when they try, they
usually screw it up)...

Peter.
--
Peter N Lewis Ph: +61 9 368 2055

+++++++++++++++++++++++++++

>From resnick@cogsci.uiuc.edu (Pete Resnick)
Date: Wed, 02 Mar 1994 00:51:17 -0600
Organization: University of Illinois at Urbana-Champaign

In article , siegel@netcom.com (Rich Siegel) wrote:

>In article resnick@cogsci.uiuc.edu (Pete Resnick) writes:
>>
>>Compilers are generally stupid. If you disassemble code resources you'll
>>see that MPW C is stupider than THINK C, but THINK C is pretty stupid too.
>>I am waiting for someone to release a compiler that does peep-hole
>>optimization. Does anyone by any chance know if MetroWorks does so?
>
>Metrowerks does. So does THINK C. So does MPW C.

Surely you're kidding. I've seen MPW C move things back and forth between
the same two registers (or worse, to and from the stack) oodles of times
in a series of instructions with no other side effects. THINK C constantly
does 'MOVE.L (A7)+,D3' followed by a 'TST.L D3' followed by a branch on
condition code. And I've never seen either of them generate a DBcc
instruction, even when I force feed it; I have lots of code for which
THINK C generates:

        SUBQ.W   #$1,D3
        CMPI.W   #$FFFF,D3
        BNE.S    *-$000C

What *are* they looking for if not these kinds of things?

As for the RISC/CISC argument I started, I've gotten lots of really cool
mail in response, some in support and some against, but almost all of it
saying, "It's a lot more complicated than you think." I figured as much,
but I do thank everyone for their comments; I have learned a lot in the
process.

pr
--
Pete Resnick (...so what is a mojo, and why would one be rising?)
Graduate assistant - Philosophy Department, Gregory Hall, UIUC
System manager - Cognitive Science Group, Beckman Institute, UIUC
Internet: resnick@cogsci.uiuc.edu

+++++++++++++++++++++++++++

>From d88-jwa@mumrik.nada.kth.se (Jon Wätte)
Date: 2 Mar 1994 10:32:23 GMT
Organization: Royal Institute of Technology, Stockholm, Sweden

>I don't have my ANSI reference in front of me, but I distinctly recall
>my shock upon reading something in it that made it quite clear that the
>semantics of your program could differ between unoptimized and
>optimized versions of your code.

So far so good.

>To give just one obvious example of how this could happen, consider a
>function that uses a pointer p, and contains code like this:
>q = *p++;
>q = *p++;
>let's say I compile this code without Common Subexpression Elimination.
> The pointer gets incremented twice, which is probably what the author
>had in mind. But if I compile with Common Subexpression Elimination
>on, the compiler is completely free to evaluate the p++ once and stick
>the result in a register. Oops.

No, that's NOT legal, since this has a very well-defined semantic
meaning.

What the compiler CAN do, is change its behaviour for UNDEFINED
cases, such as:

foo ( q ++ , q ++ ) ;

Where foo might be called as foo ( 0 , 1 ) or foo ( 1 , 0 )
depending on moon phase.

Anyway, since your program shouldn't be relying on such UNDEFINED
behaviour, the above allowance really isn't a problem.

And since we all validate our code and use ASSERTs everywhere and
step through it at the instruction level to verify it works right
after we wrote it (using all available documentation) this group
shouldn't have any "help me with my bug" questions either :-) :-)
--
 -- Jon W{tte, h+@nada.kth.se, Mac Hacker Deluxe --
 "It was, in fact, cool as all get-out. Fortunately it was a little
 too late (historically speaking) to be groovy." -- Dennis Pelton

+++++++++++++++++++++++++++

>From infosafe@panix.com (Infosafe Systems)
Date: 2 Mar 1994 11:28:02 -0500
Organization: PANIX Public Access Internet and Unix, NYC

In article , Rich Siegel wrote:
>In article resnick@cogsci.uiuc.edu (Pete Resnick) writes:
>>Here's a rash claim (flames against my ignorance welcome): Stupid
                                                             ^^^^^^
>>compilers are why RISC processors do better than CISC processors. If
            ^^^ ^^^ ^^^^            ^^ ^^^^^^ ^^^^ ^^^^
>>compilers were smart enough to take advantage of all of the interesting
>>addressing modes and instructions on CISC architectures, CISCs would
>>overall be faster at running programs than RISCs. (*Duck*)

I'm no expert but perhaps someone here could help me out. A few months
ago I read a longish article in Byte? (I will post the reference
tomorrow when I am in work, if anyone is interested) about the greatness
of the MPC chip.

One of the topics it discussed in detail was the fact that RISC chips have
become popular, and more powerful than CISC chips for a bunch of reasons.
These included fixed instruction length to eliminate bubbles in the
pipeline, faster instruction decode because you don't have to spend time
figuring out how long your instruction is, etc.

The last point that they made was that RISC has "come out on top" *because*
of improvements in compiler technology. The article said that optimizers
were able to make better use of RISC instructions than CISC instructions,
and this has been one of the motivating forces in developing RISC chips
for workstations, where you have *good* compilers ;-)

Opinions? Let the flames begin!

(Again, I will post the article and a better summary, if anyone is
interested)

Bradford Smith

+++++++++++++++++++++++++++

>From siegel@netcom.com (Rich Siegel)
Date: Wed, 2 Mar 1994 16:06:46 GMT
Organization: Bare Bones Software

In article resnick@cogsci.uiuc.edu (Pete Resnick) writes:
>In article , siegel@netcom.com (Rich Siegel) wrote:
>
>>In article resnick@cogsci.uiuc.edu (Pete Resnick) writes:
>>>
>>>Compilers are generally stupid. If you disassemble code resources you'll
>>>see that MPW C is stupider than THINK C, but THINK C is pretty stupid too.
>>>I am waiting for someone to release a compiler that does peep-hole
>>>optimization. Does anyone by any chance know if MetroWorks does so?
>>
>>Metrowerks does. So does THINK C. So does MPW C.
>
>Surely you're kidding. I've seen MPW C move things back and forth between
>the same two registers (or worse, to and from the stack) oodles of times
>in a series of instructions with no other side effects. THINK C constantly
>does 'MOVE.L (A7)+,D3' followed by a 'TST.L D3' followed by a branch on
>condition code. And I've never seen either of them generate a DBcc
>instruction, even when I force feed it; I have lots of code for which
>THINK C generates:
>
> SUBQ.W #$1,D3
> CMPI.W #$FFFF,D3
> BNE.S *-$000C
>
>What *are* they looking for if not these kinds of things?

I'm not kidding, and don't call me Shirley.
:-)

Don't confuse instruction selection and target analysis with peepholing,
and DBRA is not a cure-all. One thing that THINK C (and THINK Pascal) do
pretty well is branch-and-loop optimization (a "peephole" operation).

All of the above-mentioned compilers do -some- peepholing. It may not be
as thorough as you might like it, but it's done.

R.
--
Rich Siegel % siegel@netcom.com % Principal, Bare Bones Software
--> For information about BBEdit, finger bbedit@world.std.com <--
"...yeah, I inhaled, and then I drank the bong water. So what're you
gonna do about it?" - Dennis Miller, on Bill Clinton

+++++++++++++++++++++++++++

>From ari@world.std.com (Ari I Halberstadt)
Date: Wed, 2 Mar 1994 17:00:16 GMT
Organization: The World Public Access UNIX, Brookline, MA

In article , Pete Resnick wrote:
>condition code. And I've never seen either of them generate a DBcc
>instruction, even when I force feed it; I have lots of code for which
>THINK C generates:
>
> SUBQ.W #$1,D3
> CMPI.W #$FFFF,D3
> BNE.S *-$000C
>
>What *are* they looking for if not these kinds of things?

Try structure assignment in THINK C. The structure has to be big enough,
or the compiler will just use word or long word move instructions. Here's
an example of a DBF instruction:

static void x(void)
{
    struct { long a, b, c, d, e, f, g; } a, b;
    a = b;
}

x:
00000000  LINK     A6,#$FFC8
00000004  LEA      $FFE4(A6),A0
00000008  LEA      $FFC8(A6),A1
0000000C  MOVEQ    #$06,D0
0000000E  MOVE.L   (A1)+,(A0)+
00000010  DBF      D0,*-$0002 ; 0000000E
00000014  UNLK     A6
00000016  RTS
0000001C
--
Ari Halberstadt ari@world.std.com #include
"These beetles were long considered to be very rare because very few
entomologists look for beetles in the mountains, in winter, at night,
during snow storms." -- Purves W. K., et al, "Life: The Science of

+++++++++++++++++++++++++++

>From jwbaxter@olympus.net (John W. Baxter)
Date: Wed, 02 Mar 1994 10:03:00 -0800
Organization: Internet for the Olympic Peninsula

In article <2l2eqi$j40@panix2.panix.com>, infosafe@panix.com (Infosafe
Systems) wrote:

> I'm no expert but perhaps someone here could help me out. A few months
> ago I read a longish article in Byte? (I will post the reference
> tomorrow when I am in work, if anyone is interested) about the greatness
> of the MPC chip.
>
> One of the topics it discussed in detail was the fact that RISC chips have
> become popular, and more powerful than CISC chips for a bunch of reasons.
> These included fixed instruction length to eliminate bubbles in the
> pipeline, faster instruction decode because you don't have to spend time
> figuring out how long your instruction is, etc.
>
> The last point that they made was that RISC has "come out on top" *because*
> of improvements in compiler technology. The article said that optimizers
> were able to make better use of RISC instructions than CISC instructions,
> and this has been one of the motivating forces in developing RISC chips
> for workstations, where you have *good* compilers ;-)

I've been around these crazy machines since around 1959 (more like 1952
if I get to count helping mother with her "homework" with her computer
usage at JPL (she used IBM 650, ElectroData <> #1, IBM 704)).

There have been periodic waves of RISCness, followed by retreats to
CISCness. But the R is gradually winning...each retreat to the C
direction goes a little less far.

I'm beginning to think that the current RISC wave is the real thing, and
that there won't be much retreat. And I tend to agree that the advances
in compiler technology are a major reason why RISC may well happen this
time.

In any case, as viewed from the late 1950s, everything we run today is
RISC. Consider the NCR 304: 3-address instructions (add A to B putting
the result in C).
One of the interesting single machine instructions was
"write-copy-read": write this record to that tape, read records from
that other tape and copy them to that tape until one has a key which
matches this test, and give me the matching record. Absolutely perfect
for father-son file updates, which aren't done much any more. The 304
also had single instruction in-memory sorts and merges.
--
John Baxter Port Ludlow, WA, USA [West shore, Puget Sound]
jwbaxter@pt.olympus.net

+++++++++++++++++++++++++++

>From mxmora@unix.sri.com (Matt Mora)
Date: 2 Mar 1994 11:10:40 -0800
Organization: SRI International, Menlo Park, CA

In article resnick@cogsci.uiuc.edu (Pete Resnick) writes:

>Here's a rash claim (flames against my ignorance welcome): Stupid
>compilers are why RISC processors do better than CISC processors. If
>compilers were smart enough to take advantage of all of the interesting
>addressing modes and instructions on CISC architectures, CISCs would
>overall be faster at running programs than RISCs. (*Duck*)

I don't think that could be possible. Even though it's called RISC the
PowerPC chip has more instructions than the 68020, or so I've read. It's
the complexity of the instruction that matters. An instruction on a RISC
takes at max 5 clocks, but on a CISC who knows how long a certain
instruction will take. There goes your pipelining.

The beauty of RISC is its simplicity. With few transistors and less
logic, it's easy to get faster CPU speeds. That's why even before the
first PowerPC Mac shipped, it has already been upgraded from 66 MHz to
80 MHz. A slower running CISC will never be able to keep up. Remember
the 32k of on-chip cache takes up a big chunk of the die size. Without
the cache the die would be even smaller (but slower).

CISC is doomed because the art of writing tight fast code is becoming
obsolete. If compiler writers won't do it, why should we?

Imagine how fast things would be on RISC with hand tuned assembly.

Xavier
--
___________________________________________________________
Matthew Xavier Mora       Matt_Mora@qm.sri.com
SRI International         mxmora@unix.sri.com
333 Ravenswood Ave        Menlo Park, CA. 94025

+++++++++++++++++++++++++++

>From nagle@netcom.com (John Nagle)
Date: Wed, 2 Mar 1994 19:30:41 GMT
Organization: NETCOM On-line Communication Services (408 241-9760 guest)

siegel@netcom.com (Rich Siegel) writes:
>In article resnick@cogsci.uiuc.edu (Pete Resnick) writes:
>>Compilers are generally stupid. If you disassemble code resources you'll
>>see that MPW C is stupider than THINK C, but THINK C is pretty stupid too.
>>I am waiting for someone to release a compiler that does peep-hole
>>optimization. Does anyone by any chance know if MetroWorks does so?
>Metrowerks does. So does THINK C. So does MPW C.

But SC++ doesn't, judging from the code.

>>Here's a rash claim (flames against my ignorance welcome): Stupid
>>compilers are why RISC processors do better than CISC processors. If
>>compilers were smart enough to take advantage of all of the interesting
>>addressing modes and instructions on CISC architectures, CISCs would
>>overall be faster at running programs than RISCs. (*Duck*)
>I have a reality adjustment for you. On the 68020, many of the fancy
>addressing modes are either a wash or are actually slower than an
>equivalent sequence of instructions using 68000-only addressing modes.

That's a problem with many machines with variable-length instructions.
Check the timings for the 68030 and '040, though; the ratios get better
as you throw more transistors at the problem.

John Nagle

+++++++++++++++++++++++++++

>From jvp@tools1.ee.iastate.edu (Jim Van Peursem)
Date: 2 Mar 94 21:24:34 GMT
Organization: Iowa State University, Ames, Iowa

In <2l2obg$fca@unix.sri.com> mxmora@unix.sri.com (Matt Mora) writes:

>In article resnick@cogsci.uiuc.edu (Pete Resnick) writes:
>>Here's a rash claim (flames against my ignorance welcome): Stupid
>>compilers are why RISC processors do better than CISC processors. If
>>compilers were smart enough to take advantage of all of the interesting
>>addressing modes and instructions on CISC architectures, CISCs would
>>overall be faster at running programs than RISCs. (*Duck*)

>I don't think that could be possible. Even though it's called RISC the
>PowerPC chip has more instructions than the 68020, or so I've read. It's
>the complexity of the instruction that matters. An instruction on a RISC
>takes at max 5 clocks, but on a CISC who knows how long a certain
>instruction will take. There goes your pipelining.

Ah yes, the wandering definition of RISC. I saw a nice summary of what
RISC means in comp.arch a while back. It's more a function of
cycles-per-instruction and number of registers, etc. than the number of
instructions supported.

 I agree with Pete that originally, RISC was driven by the fact that
most compilers simply didn't support the complex instructions in some
chips. But more accurately, some complex instructions in some chips were
slower than their counterpart simple instructions to perform the same
task. The VAX had several of these as I recall. Anyway, your definition
of RISC taking a max 5 clock cycles is wrong. That depends on the depth
of the pipe and other factors.

>The beauty of RISC is its simplicity. With few transistors and less
>logic, it's easy to get faster CPU speeds. That's why even before the
>first PowerPC Mac shipped, it has already been upgraded from 66 MHz to
>80 MHz.

But these days, the RISC processors have around the same number of
transistors as their CISC counterparts.
It's more a function of the regularity of the instructions. Since all
instructions are the same size, and decode the same, it's easy to
pipeline them and keep the pipe full. I say the reason RISC is winning
the game now is because of the advancements in both the pipelines and
compilers.

>Imagine how fast things would be on RISC
>with hand tuned assembly.

It would be slower. :) Remembering all of the pipeline hazards by hand
is very complex. A compiler can do a much better job of these kinds of
scheduling issues for any reasonably sized routine.

+---------------------------------------------------------------+
| Jim Van Peursem - Ph.D. Candidate - Ham Radio -> KE0PH        |
| Department of Electrical Engineering and Computer Engineering |
| Iowa State University - Ames, IA 50011 : (515) 294-8339       |
| internet - jvp@iastate.edu -or- jvp@cpre1.ee.iastate.edu      |
+---------------------------------------------------------------+

+++++++++++++++++++++++++++

>From jim@brunner.wf.com (Jim Brunner)
Date: 2 Mar 94 02:57:28 GMT
Organization: (none)

In article , you write:
>
> Here's a rash claim (flames against my ignorance welcome): Stupid
> compilers are why RISC processors do better than CISC processors. If
> compilers were smart enough to take advantage of all of the interesting
> addressing modes and instructions on CISC architectures, CISCs would
> overall be faster at running programs than RISCs. (*Duck*)

Actually, good RISC compilers are MORE difficult because of the
difficulty of doing pipeline optimization.

- - Jim Brunner (jim@brunner.wf.com)

+++++++++++++++++++++++++++

>From jtbell@cs1.presby.edu (Jon Bell)
Date: Wed, 2 Mar 94 23:06:40 GMT
Organization: Presbyterian College, Clinton, South Carolina USA

In article , joe gregor wrote:
>In Article <2kueq8$865@ncrpda.curtin.edu.au>, peter@ncrpda.curtin.edu.au
>(Peter N Lewis) wrote:
>
>>C or Pascal produce about the same quality code. It's generally about
>>half the speed of reasonably tight assembly code...
>
> I had always heard/read that C was the fastest language next to asm.
>I *never* heard/read that Pascal was even close, let alone faster. Please
>identify your references so I may (re)educate myself.

It makes no sense to talk about the "speed" of a language (or rather,
its compiled code) without reference to the compiler and the computer
it's running on. There is nothing intrinsic in the design of Pascal or C
that would make one faster than the other. What counts is the
implementation.

On some computers, C is indeed much faster than Pascal because their C
compilers are more "efficient" in that respect than their Pascal
compilers. On the Mac, this is not the case. The two languages run just
about neck and neck when you compare (say) Think Pascal with Think C or
MPW Pascal with MPW C.

I don't have specific references handy, but I'll browse through my back
issues of MacTutor and see if I can find something to post if no one
beats me to it.

--
Jon Bell
Presbyterian College
Dept. of Physics and Computer Science
Clinton, South Carolina USA

+++++++++++++++++++++++++++

>From especkma@reed.edu (Erik A. Speckman)
Date: 3 Mar 1994 06:56:45 GMT
Organization: Hellmouth-Heater Democrat

In article , Jim Van Peursem wrote:

> I agree with Pete that originally, RISC was driven by the fact that
>most compilers simply didn't support the complex instructions in some
>chips. But more accurately, some complex instructions in some chips were
>slower than their counter-part simple instructions to perform the same
>task. The VAX had several of these as I recall. Anyway, your definition
>of RISC taking a max 5 clock cycles is wrong. That depends on the depth
>of the pipe and other factors.

No, no, no, it's that implementing complex instructions costs silicon
that could be better used increasing pipeline complexity or cache and
speeding up all operations.

No, it's a dessert topping.

No, it is all those things and a dessert topping to boot.
>
--
____________________________________________________________________________
Erik Speckman especkma@romulus.reed.edu
GBDS Workstation- A high-performance microcomputer designed to run benchmarks.

+++++++++++++++++++++++++++

>From rang@winternet.mpls.mn.us (Anton Rang)
Date: 03 Mar 1994 14:37:47 GMT
Organization: Minnesota Angsters

In article resnick@cogsci.uiuc.edu (Pete Resnick) writes:
>In article <2kueq8$865@ncrpda.curtin.edu.au>, peter@ncrpda.curtin.edu.au
>(Peter N Lewis) wrote:
>
>>>What's the best way to optimize C?
>>
>>Don't use any pointers. Pointers screw up optimizing compilers.
>
>What?!?! Sometimes using pointers is the only way to get a really stupid
>compiler to do register allocation and register loading properly.

 Yes, but if you have a really *smart* compiler, pointers screw it up
royally, especially if you are passing parameters around by address.
Once you take the address of an object, almost any access through a
pointer stops the compiler from being able to optimize it. So you're
often better off to use a temporary variable, like --

  temp_x = x;
  do_something(&temp_x);
  x = temp_x;

and keep 'x' local, so the compiler can (a) keep it in a register, and
(b) assume that its value isn't changed by almost *every* call and
every assignment through a pointer.

 Of course, using the best algorithms is still the best way to
optimize *any* language.

 -- Anton
--
Anton Rang (rang@acm.org)

+++++++++++++++++++++++++++

>From Brad Koehn
Date: 4 Mar 1994 01:58:54 GMT
Organization: University of Wisconsin

In article <2l17nk$1ku@ncrpda.curtin.edu.au> Peter N Lewis,
peter@ncrpda.curtin.edu.au writes:
>Try it and see for yourself. Pascal compilers will generally produce faster
>code than C compilers with equivalent source. Obviously, compilers vary
>a great deal, but the above is generally true. I wouldn't worry though,
>I expect the next generation of compilers will make C faster than Pascal,
>since Pascal compilers will get a lot less work done on them. Of course,
>then everyone will be using C++ and efficiency will be thrown right out
>the window, but such is life ;-)

Heck, check out the XLC and XLC++ compilers from IBM. Talk about
well-built! From what I understand, Apple was using them for quite a
while to compile for PPC (ahh, MPW, you dog you). Anyway, these two
beasties can do some really nice magic with your code. And IBM just
keeps improving them. Now, if only the Mac had it so good... Oh, that's
right, PowerOpen lets me use both AIX and MAS on the same machine! So I
can use XLC++ with my Mac code! Life is so good!

_________________________________________________________________________
Brad Koehn
Data Transformations, Inc.
koehn@macc.wisc.edu

+++++++++++++++++++++++++++

>From Brad Koehn
Date: 4 Mar 1994 02:06:09 GMT
Organization: University of Wisconsin

In article Anton Rang, rang@winternet.mpls.mn.us writes:
> Yes, but if you have a really *smart* compiler, pointers screw it up
>royally, especially if you are passing parameters around by address.
>Once you take the address of an object, almost any access through a
>pointer stops the compiler from being able to optimize it. So you're
>often better off to use a temporary variable, like --
>
> temp_x = x;
> do_something(&temp_x);
> x = temp_x;
>
>and keep 'x' local, so the compiler can (a) keep it in a register, and
>(b) assume that its value isn't changed by almost *every* call and
>every assignment through a pointer.

Can't the compiler just check the code for do_something and see if it
changes temp_x? I realize it would be a real pain for compilers, but so
what? Life is pain. Anyone who tells you otherwise is trying to sell you
something.* Just build a table for each function that checks to see if
the parameters are changed.

Sigh. I guess I should have taken that compiler class...

* The Man in Black, "The Princess Bride"
_________________________________________________________________________
Brad Koehn
Data Transformations, Inc.
koehn@macc.wisc.edu

+++++++++++++++++++++++++++

>From pottier@prao.ens.fr (Francois Pottier)
Date: 4 Mar 1994 14:26:33 GMT
Organization: Ecole Normale Superieure, PARIS, France

In article <2l652h$aa6@news.doit.wisc.edu>, Brad Koehn wrote:

>> temp_x = x;
>> do_something(&temp_x);
>> x = temp_x;

>Can't the compiler just check the code for do_something and see if it
>changes temp_x? I realize it would be a real pain for compilers, but so

No, it can't. This looks like a typical undecidable problem. In a language
with pointers like C or Pascal, there are tons of ways of modifying a
variable. For instance, look at this:

long x, y;
long *p;

p = &x;
p++;
*p = 0;

This code modifies y. Hard to tell, uh ?
I think it should be relatively easy to demonstrate that the problem is
undecidable. I could try writing a proof if you wish.

--
Francois Pottier ___ ___ _ _ / ___ ___ ___ pottier@dmi.ens.fr /_ /__/ /_| /| / / / / / / /__ / / \ / | / |/ /___ /__/ / ___/ _ /

+++++++++++++++++++++++++++

>From mmorgan@gpu.srv.ualberta.ca (Martin Morgan)
Date: 4 Mar 1994 16:02:53 GMT
Organization: University of Alberta

Francois Pottier (pottier@prao.ens.fr) wrote:
: Brad Koehn wrote:

: >> temp_x = x;
: >> do_something(&temp_x);
: >> x = temp_x;

: >Can't the compiler just check the code for do_something and see if it
: >changes temp_x? I realize it would be a real pain for compilers, but so

: No, it can't. This looks like a typical undecidable problem. In a language

or the user declare do_something as do_something (const type-of-x *)?
Is that a legal declaration?

: variable. For instance, look at this:

: long x, y;
: long *p;

: p = &x;
: p++;
: *p = 0;

: This code modifies y. Hard to tell, uh ?

surely memory allocation of x and y is left to the compiler, so there's
no guarantee that anything relevant to the snippet is modified by
*p = 0?
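
The const question and the temporary-copy idiom quoted above can be
sketched in C. This is only an illustration under stated assumptions, not
code from any poster: twice, bump and count_up are hypothetical names, and
only the temp_x pattern comes from the thread.

```c
/* Hypothetical callee that only reads through its pointer. Declaring the
   parameter as pointer-to-const is legal C. It documents a promise to the
   caller, but it does not by itself prove to a compiler that the object
   can never change, since other non-const pointers to it may exist. */
long twice(const long *value)
{
    return *value + *value;
}

/* Hypothetical callee that writes through its pointer. */
void bump(long *value)
{
    *value += 1;
}

/* The temporary-copy idiom from the thread: only temp_x ever has its
   address taken, so x itself is never exposed to the callee. */
long count_up(long start, int times)
{
    long x = start;
    int i;

    for (i = 0; i < times; i++) {
        long temp_x = x;   /* copy in */
        bump(&temp_x);     /* callee sees only the copy */
        x = temp_x;        /* copy back */
    }
    return x;
}
```

With these definitions, count_up(5, 3) returns 8, and twice leaves the
object it points at unchanged. Whether a particular compiler actually keeps
x in a register here is something to check by disassembling its output.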

Martin Morgan

+++++++++++++++++++++++++++

>From rang@winternet.mpls.mn.us (Anton Rang)
Date: 05 Mar 1994 15:44:45 GMT
Organization: Minnesota Angsters

In article <2l652h$aa6@news.doit.wisc.edu> Brad Koehn writes:
>> temp_x = x;
>> do_something(&temp_x);
>> x = temp_x;
>
>Can't the compiler just check the code for do_something and see if it
>changes temp_x? I realize it would be a real pain for compilers, but so
>what?

 Yes and no. The problem is that C's separate compilation model
calls for "compile a bunch of functions so that they all work correctly
independently; at link time, hook them together." To make this work
right, you need to defer code generation until linking. There are Ada
development systems which do this (they need to generate extremely
tight code for embedded systems).

 However, because of C's ubiquitous pointers, this trick only works
for very simple functions. Figuring out what variables a given
procedure call might modify, or even an approximation to it, in the
presence of globals or structures containing pointers, is non-trivial.
--
Anton Rang (rang@acm.org)

+++++++++++++++++++++++++++

>From zstern@adobe.com (Zalman Stern)
Date: Mon, 7 Mar 1994 02:50:34 GMT
Organization: Adobe Systems Incorporated

Anton Rang writes
> Yes and no. The problem is that C's separate compilation model
> calls for "compile a bunch of functions so that they all