ElkArte Community

Project Support => General ElkArte discussions => Topic started by: Spuds on January 20, 2014, 10:24:43 am

Title: XML Hangovers
Post by: Spuds on January 20, 2014, 10:24:43 am
Since the code we adopted was at one point xhtml valid, we of course have a lot of xml style markup in it.  I think xml was good in that it was more restrictive in what was valid markup; html5, however, lets you do all manner of things and it's still valid, not that I am saying let's do that.

The templates still validate to html5 with these items, but this is a question of what standards do we want to use in the markup, past the obvious stuff like lowercase tags etc.  Below are some of the items I'm specifically thinking about.

So things like
Title: Re: XML Hangovers
Post by: Nao on January 20, 2014, 03:52:21 pm
Did all that years ago. Never looked back ;)
Go for it, but it'll take you weeks to get it all right. I know I didn't expect it to take so long.
As for data, I learned years after that that I could remove them in HTML5. That was easier ;)
Title: Re: XML Hangovers
Post by: emanuele on January 20, 2014, 04:22:56 pm
I've never been a big fan of html (TBH), always preferred xhtml, but I have no objections! ;)

Kill all of them with fire!!
Title: Re: XML Hangovers
Post by: TE on January 20, 2014, 11:42:05 pm
same here.. took me years to properly use the XML syntax and now I should go back? LOL.

PMs and messages need to be processed in some way, could do that via Importer....
Title: Re: XML Hangovers
Post by: Nao on January 21, 2014, 01:42:15 am
HTML5 was already 'the thing to do' back in 2009-2010... we're in 2014 and it's time to do the right thing and make ElkArte use less bandwidth... SMF will do it too, eventually, I'm sure Pete won't refuse to go HTML5 just because I did it that way in Wedge :P

Also, json >= xml but I'm not feeding another troll! ::)
Title: Re: XML Hangovers
Post by: TE on January 21, 2014, 02:10:45 am
Is it really sooo important to save some bytes? I don't believe it's more than just a few Megabytes in a month, even for a bigger sized board...

I personally prefer the XHTML5 syntax but I don't care that much about using one or the other for Elk, thus I'd bow to the majority.
Title: Re: XML Hangovers
Post by: Spuds on January 21, 2014, 09:57:22 am
I don't have any real issue with leaving things xml-ish, just wanted to see what others felt on the matter, so I / we could be consistent  :D  Kind of like how we format our php code, lots of options.  Sounds like we are happy with lowercase code, quoted attributes and trailing slashes on void elements. 

I may still remove the cdata stuff from the script tags, it's not like we are serving the pages as xhtml, nor do I think we would again.  In fact, from reading some of the processing deltas between html and xhtml, we use some tags that would cause problems, and some js that would also.
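For anyone wanting to try the cdata cleanup mentioned above, here is a minimal Python sketch, not the project's actual tooling. It assumes the guards sit on their own lines inside inline script blocks in one of three hypothetical styles (`// <![CDATA[`, `/* <![CDATA[ */`, or `<!-- // --><![CDATA[`, with matching `]]>` closers); the function name `strip_cdata_guards` is made up for illustration.

```python
import re

# An inline <script>...</script> block: opening tag, body, closing tag.
SCRIPT_BLOCK = re.compile(r"(<script\b[^>]*>)(.*?)(</script>)",
                          re.DOTALL | re.IGNORECASE)

# A line holding nothing but a CDATA guard (assumed guard styles, see above).
GUARD_LINE = re.compile(
    r"^\s*(?://|/\*|<!--\s*//\s*-->)?\s*(?:<!\[CDATA\[|\]\]>)\s*(?:\*/)?\s*$"
)


def strip_cdata_guards(html: str) -> str:
    """Drop guard-only lines inside script blocks; leave everything else alone."""
    def clean(match):
        open_tag, body, close_tag = match.groups()
        kept = [line for line in body.split("\n") if not GUARD_LINE.match(line)]
        return open_tag + "\n".join(kept) + close_tag

    return SCRIPT_BLOCK.sub(clean, html)


if __name__ == "__main__":
    page = "<p>hi</p>\n<script>\n// <![CDATA[\nvar a = 1;\n// ]]>\n</script>"
    print(strip_cdata_guards(page))
```

Only lines that consist entirely of a guard are removed, so a `]]>` that happens to appear in the middle of real script code is left untouched.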
Title: Re: XML Hangovers
Post by: Joshua Dickerson on January 21, 2014, 10:57:15 am
I go with HTML5. Sometimes I'll use the slash and sometimes I'll do checked="checked". Either way, so long as it validates as HTML5, I don't waste time (unless I'm bored) making any changes.
Title: Re: XML Hangovers
Post by: emanuele on January 21, 2014, 02:59:55 pm
I was probably carried away by the "others?", and my comment on the xhtml-ish was more about actually closing the tags or not (leaving them open is a possibility in html5); about the self-closing I'm neutral.

I'm really neither in favour nor against.

Well, the checked="checked" and the like always bothered me a bit (sounds dumb).
Anything else... do what you prefer! O:-)
Title: Re: XML Hangovers
Post by: Spuds on January 21, 2014, 03:28:10 pm
Well it's valid as is, and I am lazy .... humm what to do what to do :D
Title: Re: XML Hangovers
Post by: Nao on January 22, 2014, 06:15:55 am
Quote from: TE – Is it really sooo important to save some bytes?
Of course it's important. Isn't it important to you guys..?

Quote: I don't believe it's more than just a few Megabytes in a month, even for a bigger sized board...
And those few megabytes may cost you extra.
HTML size is important because it's not something that's of concern for the visitor OR the server -- it's of concern for both. If you're visiting from mobile, you'll be glad someone thought of making the pages you're visiting regularly lighter, so that you don't go over your data limit. If you're sending from a small-sized server, you'll be glad your website held out for an extra 30 seconds the day you got slashdotted. Oh, it all adds up. It's not 'just' the removal of bits here and there. It's a philosophy. If you keep at it, you'll get good results.
I did a quick comparison of two topic pages taken from Wedge and ElkArte. I removed all content to make it fairer to compare, of course. I also removed all code pertaining to admin and Wysiwyg quick reply (which isn't enabled here, and can be disabled there). The Elk page is 23KB, and the Wedge page is 16KB. Even after gzipping (Wedge isn't as good at gzipping because it's already optimized for size and doesn't have many repeated patterns), it saves about 400-500 bytes per page load. On a relatively low-traffic forum like ElkArte's or Wedge's, that makes for about 500KB to 1MB per day in wasted data. So, between 15 and 30MB per month. And that's only for the gzipped HTML. Then you get the CSS and JS, and other things like that.
A bandwidth-conscious admin would be well advised to steer clear of SMF (or most forum software for that matter), but I certainly wouldn't like to say that Elk isn't bandwidth-conscious.
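As a rough check of the figures above, here is a minimal Python sketch of the arithmetic. The daily page-load counts (1250 and 2000) are not stated in the post; they are assumptions back-derived from "400-500 bytes per page load" and "500KB to 1MB per day", and the helper name is hypothetical.

```python
def monthly_savings_mb(bytes_saved_per_load, page_loads_per_day, days=30):
    """Bytes saved per page load, scaled up to megabytes per month (1MB = 10**6 bytes)."""
    return bytes_saved_per_load * page_loads_per_day * days / 1_000_000


if __name__ == "__main__":
    # Assumed traffic: 1250 to 2000 page loads per day (not given in the post).
    low = monthly_savings_mb(400, 1250)
    high = monthly_savings_mb(500, 2000)
    print(f"roughly {low:.0f} to {high:.0f} MB per month")
```

With those assumed load counts the result matches the 15 to 30MB per month quoted above; a busier board would scale linearly.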
I'll admit that I went a bit overboard on bandwidth matters over the years (what you could call premature micro-optimization), and it often managed to upset others. But you don't have to be a bytenazi to recognize the virtues of saving space. Don't brush off my hints just because of my ODD behavior, there's certainly a solid middle ground somewhere in there. ;)

Oh, speaking of ODD...
data-icon="some entity;" (<-- it got turned into an actual entity by the editor, and I don't remember the number)
That's in unread_something (can't remember either), and I'm only pointing it out because it has a semi-colon, while other data-icon occurrences don't. It feels... uncomfortable. (Please save this byte. Just look for data-icon in your code and find the one with a semi-colon.)

Quote: I personally prefer the XHTML5 syntax but I don't care that much about using one or the other for Elk, thus I'd bow to the majority.
XHTML had its use many years ago, in the battle against tag soups and <,PEOPLE NOt knoZING='howTo..WRIte"code=butitwazalongtimeago.>
Still, respect.

Shout out, XHTML! You made our lives better for a while, but you're not my !DOCTYPE anymore.
Title: Re: XML Hangovers
Post by: TE on January 22, 2014, 06:43:25 am
Quote: Oh, speaking of ODD...
data-icon="some entity;" (<-- it got turned into an actual entity by the editor, and I don't remember the number)
That's in unread_something (can't remember either), and I'm only pointing it out because it has a semi-colon, while other data-icon occurrences don't. It feels... uncomfortable. (Please save this byte. Just look for data-icon in your code and find the one with a semi-colon.)
Thanks for the report, it's already fixed in our repo :)
Title: Re: XML Hangovers
Post by: Spuds on January 22, 2014, 10:07:23 am
Quote: Oh, speaking of ODD...
data-icon="some entity;"
Woops that was ME  :D I'm just a lemming of the validator !

Agree on working to make the pages lower in weight, we still have extra heft in the markup, and still some inline style stuff to sniff out and, on some lesser used pages, blocks of inline script, etc, etc, lots of opportunity :D  When I think about the amount of that we have already removed, it's staggering really
Title: Re: XML Hangovers
Post by: Nao on January 22, 2014, 10:17:30 am
It certainly does feel faster than SMF2 already, although it's hard to compare apples and oranges (was that the proper expression?) when they're on two different servers.
Title: Re: XML Hangovers
Post by: Joshua Dickerson on January 22, 2014, 11:06:56 am
If any admin is worried about 100 bytes per page, they are ridiculous. An image alone is many KB. If they're concerned with it, I say let them rewrite all of the themes. This is 2014, not 1976; the bandwidth of any hosting plan is of a large enough size and capacity that it doesn't matter.

Just do what makes development easier. If you worry about minutiae, you'll turn off developers. When you turn off developers, you become stagnant. When you become stagnant, you don't get or don't retain users.
Title: Re: XML Hangovers
Post by: kucing on January 24, 2014, 12:09:16 am
Well.. depends on the scale really. http://chrishateswriting.com/post/68794699432/small-things-add-up

Title: Re: XML Hangovers
Post by: Joshua Dickerson on January 24, 2014, 08:04:04 am
No, not really... "500 million pageviews per month adds up to 46 terabytes per month" if 46 TB is a lot to Yahoo! today, they have problems.
Title: Re: XML Hangovers
Post by: Nao on January 29, 2014, 02:06:36 pm
I did a test yesterday enabling $settings['minify_html'] (a Wedge special) on my forum, and for an 11KB (gzipped size) page, I'm saving over 200 bytes. It's not something to be ignored, and this is free -- the only thing it does is remove tabs that start any line! But I like indentation, so I guess I'll have to live with the 200 extra bytes... Lol.
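To get a feel for that kind of saving on your own pages, here is a minimal Python sketch of the idea described above (removing the tabs that start any line) with a gzipped-size comparison. It is not Wedge's implementation; the function names and the sample markup are made up for illustration.

```python
import gzip
import re

# One or more tabs at the very start of any line.
LEADING_TABS = re.compile(r"^\t+", re.MULTILINE)


def strip_leading_tabs(html: str) -> str:
    """Remove indentation tabs at line starts; tabs elsewhere are kept."""
    return LEADING_TABS.sub("", html)


def gzipped_size(text: str) -> int:
    """Size in bytes of the UTF-8 text after gzip compression."""
    return len(gzip.compress(text.encode("utf-8")))


if __name__ == "__main__":
    # Made-up sample markup; substitute a saved copy of a real page.
    page = "<ul>\n" + "\t<li>\n\t\t<a href=\"#\">topic</a>\n\t</li>\n" * 50 + "</ul>\n"
    slim = strip_leading_tabs(page)
    print("raw bytes:    ", len(page), "->", len(slim))
    print("gzipped bytes:", gzipped_size(page), "->", gzipped_size(slim))
```

One caveat with this approach: it also strips leading tabs inside pre blocks and textareas, where whitespace is part of the content, so a real implementation would need to skip those.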
Title: Re: XML Hangovers
Post by: Joshua Dickerson on January 29, 2014, 06:10:04 pm
That's a savings of 1.8%. The 304 response to check if your avatar is old was 509 bytes and its full size is 24.4KB, according to Chrome DevTools. 200 bytes means nothing.
Title: Re: XML Hangovers
Post by: Nao on January 30, 2014, 04:21:31 pm
It's with this kind of thinking that the web will end up with an average web page size of over 1MB...

...

Oh wait, scratch that, it's already done (http://www.webperformancetoday.com/2013/06/05/web-page-growth-2010-2013/).  ::)
Title: Re: XML Hangovers
Post by: Joshua Dickerson on January 31, 2014, 10:11:21 am
Did you even look at that chart? Images are over 400KB whereas the HTML is next to nothing. The "other" field, which I'm guessing is streaming media (video), accounts for the biggest percentage change. Read the "Despite all this growth, is the internet getting faster?" title. Still, if the internet connection speed is increasing, does that even matter? http://thenextweb.com/insider/2013/07/23/akamai-average-internet-speed-up-17-year-over-year-to-finally-pass-3-mbps-while-mobile-data-traffic-doubled

You're trying to optimize text on a page when people are streaming videos from the same page. I used to be the biggest micro-optimization idiot around. I learned that the most important thing is the speed of development. If you don't develop, you won't have users. You can't develop without making it nice for developers. Worrying about minutiae (like coding guidelines) is not what developers want to do. If a developer wants to spend his or her time going through the code to make it look better or perform better, good for them, but any pressure on other developers to do the same will turn them off.
Title: Re: XML Hangovers
Post by: Nao on January 31, 2014, 05:01:45 pm
I'm not even gonna dignify that with an answer... :P

Quote: Images are over 400KB whereas the HTML is next to nothing.
Sure. But Wedge also optimizes uploaded images. Even Aeva Media ensures the size of previews is the smallest possible for the best quality.

Quote: The "other" field, which I'm guessing is streaming media (video), accounts for the biggest percentage change. Read the "Despite all this growth, is the internet getting faster?" title.
Just because my connection is 8MBps (and I suppose for anyone living in Paris it's ridiculously slow) doesn't mean everyone is as lucky as I am. Most developing countries have slow connections, and while a 1MB page will still load quickly for them, it's not something to be ignored.
More importantly, mobile phones have limited capabilities. Not everyone has a Galaxy S3 like I do. And I find it very slow on today's web pages. Even the S4 is just 'okay' for web browsing. Last time I tried browsing with my old iPod Touch 4, I was about to cry. It was just so excruciatingly slow.

And this is all down to the fact that images and videos are okay, but large SCRIPTS aren't. How often do you find a news site that has tons of animated banners? One of those nerve-wracking popups asking for your opinion on the website when you haven't even had the opportunity to read a line of the article? And one of those wonderful little snippets that offer to "Share this boring article on Facebook, Reddit, Twitter, EatMyCrap, and 848 other websites we've never been to but if you're using them, you MUST share on them!"...?
On my mobile phone, every time I'm getting some news, I'm dreading that I'll be redirected to one of the many sites that make my life miserable. These sites, they simply don't CARE about how much bandwidth they use, because usually they include external scripts, Flash files and so on. They don't realize they're not getting new followers because they're trying so hard to pack every monetizing trick into a single page.

jQuery inclusion has got more of a trivial impact on bandwidth, because chances are you'll be getting a version that's already cached in your browser. But apart from that one... Hmm.
Also, not only do these scripts take time to load, but they'll often force a re-layout, thereby disrupting your reading experience right in the middle of it. Take into account the fact that mobile browsers are slow to execute JavaScript, and it's just something that doesn't work for me.

That's one of the many reasons why I think it's not 'dirty' to talk about micro-optimizations.

Quote: I learned that the most important thing is the speed of development. If you don't develop, you won't have users. You can't develop without making it nice for developers.
Well... I'm afraid you can.

Quote: Worrying about minutia (like coding guidelines) is not what developers want to worry about. If a developer wants to spend his or her time going through the code to make it look better or perform better, good for them, but any pressure to make other developers to do the same will turn them off.
Or, alternatively, you could accept user submissions, and just do a quick refactoring to make it fit your guidelines.