lj_dev: RSS, XML, Encodings, fun

Brad Fitzpatrick (bradfitz) wrote in lj_dev,
@ 2001-11-03 00:03:00
RSS, XML, Encodings, fun
I added RSS support today (example) because somebody asked for it recently.

It was really easy, but right now we're spitting out bad XML on journals that aren't in UTF-8 or a subset of it, like plain ASCII.

I'm fixing this by doing everything properly with XML::DOM and Unicode::MapUTF8 ... both great modules.

We have a 'lang' field in the user table. We'll probably also need a default-encoding userprop, and then we need to expose both that and the language field.

And we should modify the protocol to let it take an encoding to convert from. Internally we'll store all data as utf-8.
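
A minimal sketch of that conversion step, in Python rather than the Perl modules named above. The function name `to_stored_utf8` and the idea of a per-request encoding argument are assumptions for illustration, not the actual LiveJournal protocol.

```python
import codecs


def to_stored_utf8(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Convert client bytes in source_encoding to UTF-8 bytes for storage.

    Hypothetical helper: the name and signature are illustrative only.
    """
    try:
        # Resolve aliases such as "windows-1251" to a known codec.
        codec = codecs.lookup(source_encoding)
    except LookupError:
        raise ValueError(f"unsupported encoding: {source_encoding!r}")
    # Decoding raises UnicodeDecodeError if the bytes do not match the claim.
    text = raw.decode(codec.name)
    return text.encode("utf-8")
```

Rejecting unknown encodings up front keeps bad data out of storage; what to do when the bytes do not match the declared encoding is left open here.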

And if we detect a charset encoding with an HTTP POST, we'll do the conversion automatically. Still have to look into how that works (which HTTP request headers are sent...).
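
One way this could work, sketched in Python: if the client sends a `charset` parameter in its `Content-Type` request header, the server can read it and decode the form body accordingly. Whether clients actually send that parameter is exactly the open question above, so the sketch falls back to UTF-8. The helper names are hypothetical.

```python
from email.message import Message
from urllib.parse import parse_qs


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset parameter of a Content-Type header, lowercased."""
    if not header_value:
        return default
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset(default)


def parse_form(body, content_type):
    """Decode a urlencoded form body using the charset the client declared."""
    charset = charset_from_content_type(content_type)
    # parse_qs percent-decodes each value using the given encoding.
    return parse_qs(body, encoding=charset)
```

The decoded values are ordinary Unicode strings, so storing them as UTF-8 afterwards is a single encode step.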

Just wanted to make it known that I really do want LiveJournal to be smart about encodings and non-English languages. It's just slow going doing so much at once.

Anybody interested in working on this? If you need help I can guide you.


(31 comments)


gleepy
2001-11-03 12:23 am UTC (link)
RSS is good. (Think of products like AmphetaDesk benefiting from a quick look at LiveJournal entries.)



gleepy
2001-11-03 12:30 am UTC (link)
Hey, I added your link to AmphetaDesk v0.91. Looks nice.


RSS syndication
momokatte
2001-11-03 09:17 am UTC (link)
According to this support request, <URL> should be <link>.

<url> belongs in the <image> sub-element, and contains the URL of a GIF, JPEG or PNG image that represents the channel.
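
To make the element placement concrete, here is a rough Python sketch of where `<link>` and `<url>` sit in an RSS 0.91 channel. The titles and example.com URLs are placeholders, and this is not a complete feed (a real channel also needs elements such as a description and language).

```python
import xml.etree.ElementTree as ET


def build_channel(title, link, image_url):
    """Build a skeletal RSS 0.91 tree showing <link> vs. <image>/<url>."""
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    # <link> points at the site the channel describes.
    ET.SubElement(channel, "link").text = link
    image = ET.SubElement(channel, "image")
    ET.SubElement(image, "title").text = title
    # <url> lives inside <image> and points at the picture itself.
    ET.SubElement(image, "url").text = image_url
    ET.SubElement(image, "link").text = link
    return rss


feed = build_channel(
    "Example journal",                    # placeholder title
    "http://www.example.com/journal/",    # placeholder site URL
    "http://www.example.com/userpic.png"  # placeholder image URL
)
xml_text = ET.tostring(feed, encoding="unicode")
```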


Re: RSS syndication
momokatte
2001-11-03 09:23 am UTC (link)
Never mind, I just looked at the spec and your implementation appears to be correct.


Re: RSS syndication - momokatte, 2001-11-03 09:25 am UTC

way2tired
2001-11-03 12:00 pm UTC (link)
Ok, so what do we use this for? Is it just to spit out a bunch of links, or is there more to it?



opiummmm
2001-11-03 11:29 pm UTC (link)
RSS is basically a syndication format based on XML. It has its uses, plenty of which I'm sure I'm not thinking of, but it seems that it's most popular for news-ticker type programs.


(no subject) - opiummmm, 2001-11-03 11:30 pm UTC
(no subject) - twistah, 2001-11-04 02:42 pm UTC

avva
2001-11-03 07:06 pm UTC (link)
Anybody interested in working on this?

I am. I used a hammer to free some reasonable chunks of time to do some lj_dev stuff, finally. Will you have me back? ;)



bradfitz
2001-11-03 09:48 pm UTC (link)
please! :)


RSS for friends pages?
markpasc
2001-11-04 12:00 pm UTC (link)
What about RSS versions of friends pages? All the journals I'd like to get in RSS are already aggregated into my friends page. Rather than make N HTTP requests for my N friends' RSS files, I could make one if my friends page were available in RSS.


Re: RSS for friends pages?
bradfitz
2001-11-04 12:16 pm UTC (link)
True.

I'll try to get somebody to add this, or I'll do it myself when I get a minute.


sorry... - gomolyako, 2004-09-01 08:42 am UTC
3+ years later - everdred, 2005-01-10 10:01 pm UTC

insomnia
2001-11-04 12:09 pm UTC (link)
Just a check on this... the RSS feeds only list public posts, right? No private or friends-only?



bradfitz
2001-11-04 12:14 pm UTC (link)
yup



twistah
2001-11-04 04:18 pm UTC (link)
Just as a note about non-UTF8 stuff, I looked at avva's journal (via RSS) and Internet Explorer 5.5 spit out an XSL error, but when I plugged the URL into FeedReader (a Windows app which reads RSS feeds), everything "showed up" -- the characters were garbled, but that is probably because I don't have some Russian/Cyrillic support installed. But come to think of it, Windows 2000 comes with all languages supported by default (AFAIK) so my theory is probably wrong...



bradfitz
2001-11-04 04:24 pm UTC (link)
IE 5.5 is correct. Your theory is wrong.

XML is interpreted as UTF-8 by default. His posts are in Windows-1251 (Cyrillic). When the XML parser hits his characters above 127, it tries to decode them as UTF-8 multi-byte sequences and fails.

We need to convert his code page to UTF-8. There is a perl module to do this (Unicode::MapUTF8) but first we need to tell it the source encoding.
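
The failure and the fix can be reproduced in a few lines. This is a Python sketch using the standard library's XML parser, not the Perl module mentioned above, and the sample word and `<title>` wrapper are made up for the demonstration.

```python
import xml.etree.ElementTree as ET

# "Привет" as it would be stored in a Windows-1251 journal.
cp1251_bytes = "Привет".encode("cp1251")


def wrap(payload: bytes) -> bytes:
    """Put raw bytes inside an element with no encoding declaration."""
    return b"<title>" + payload + b"</title>"


def parses(document: bytes) -> bool:
    """Report whether the parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Without a declaration the parser assumes UTF-8 and rejects the raw bytes.
raw_ok = parses(wrap(cp1251_bytes))

# Converting from the known source encoding to UTF-8 makes it well-formed.
converted = cp1251_bytes.decode("cp1251").encode("utf-8")
converted_ok = parses(wrap(converted))
```

The conversion only works because the source encoding is supplied explicitly, which is why it has to be known first.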


(no subject) - bradfitz, 2001-11-04 04:25 pm UTC

mart
2001-11-05 06:31 am UTC (link)

We need a lang field in log and possibly talk too, since people sometimes write in languages other than their primary language. The interface to this is of course a pain, but at least if the field is there we can find some wonderful way of having the user specify it which is user-friendly.

Besides, it'd be cool if the HTML output on a journal view could do <something lang="es"> around the entries where they differ from that set in the user table...



bradfitz
2001-11-05 07:02 am UTC (link)
We're 10 steps ahead of ya, yo. :)

If we convert everything to UTF-8 we can simply mix text from every encoding on one page, since UTF-8 can encode the characters of every code page.
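
A small Python sketch of that idea, assuming each entry carries its own source encoding (the entries and the `render_page` name are invented for illustration):

```python
# Each entry is (raw bytes, source encoding), as if written by different users.
entries = [
    ("Привет".encode("cp1251"), "cp1251"),
    ("café".encode("iso-8859-1"), "iso-8859-1"),
    ("日本語".encode("shift_jis"), "shift_jis"),
]


def render_page(entries):
    """Decode every entry from its own code page and emit one UTF-8 page."""
    lines = [raw.decode(encoding) for raw, encoding in entries]
    return "\n".join(lines).encode("utf-8")


page = render_page(entries)
```

Once everything is decoded, the page needs only a single declared encoding, whatever the entries were originally written in.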


(no subject) - bradfitz, 2001-11-05 07:32 am UTC
Support [description]'s?
morbus
2001-11-05 07:44 pm UTC (link)
Hey there - I'd like to throw in a vote for [description] tags. I read about 10 to 15 LJ's a day, and it would be great to load them all up in AmphetaDesk (I'm the creator of AmphetaDesk - see it here: http://www.disobey.com/amphetadesk/) and read the entire post (with HTML) and then just click to comment on the ones that interest me. What are your thoughts?


Re: Support [description]'s?
bradfitz
2001-11-21 10:04 am UTC (link)
I got a patch recently to allow that. I'll get it in soon.


Re: Support [description]'s? - voidstar, 2001-11-25 07:08 am UTC
Re: Support [description]'s? - voidstar, 2001-12-16 01:52 am UTC
What about exporting your interests?
wkearney
2001-11-21 06:58 am UTC (link)
Hi,

I'm working with a bunch of folks over in the Syndic8 mailing list to develop an extension to RSS that supports categories. The nearest equivalent I can see in LJ is either a Topic or the interest keywords.

What are your thoughts on including that information with the feed and/or with each item?

You're all welcome to read/join the syndic8 list. We'd welcome the input.
http://groups.yahoo.com/group/syndic8

Thanks,
Bill Kearney


Re: What about exporting your interests?
bradfitz
2001-11-21 10:05 am UTC (link)
That'd be cool.

Not sure how useful it'd be, though ... how would an RSS consumer present it?


What's new with LJ and RSS ?
quercus
2002-06-04 06:15 am UTC (link)
Anything happening in the LJ / RSS world ? (I'm new to LJ)

Anyone interested in working on RSS 1.0, instead of 0.91 ?


