james/yarr

mirror of https://github.com/nkanaev/yarr.git synced 2025-07-10 00:20:13 +00:00

Author	SHA1	Message	Date
Nazar Kanaev	2a4d974965	go fmt	2024-10-07 12:20:45 +01:00
Karol Kosek	b9b3d2350c	atom: Stop unescaping special HTML characters The HTML data in Atom is escaped because the data needs to be put as a string into an XML file. If we are accessing it by reading the string value, then it is already unescaped, as opposed to getting the raw XML data. XHTML data doesn't need to be unescaped either, since the elements are already encoded as is in the tree. :) Closes #198	2024-06-16 11:35:32 +01:00
Will Harding	3adcddc70c	Pull atom xhtml title from nested elements The Atom spec says that any title marked with a type of "xhtml" should be contained in a div element[1] so we need to use the full XML text when extracting the text. [1] https://www.rfc-editor.org/rfc/rfc4287#section-3.1	2023-09-23 21:08:22 +01:00
Nazar Kanaev	850ce195a0	fix atom links	2023-09-07 18:19:17 +01:00
Nazar Kanaev	bc18557820	handle isPermalink in rss feeds	2023-05-20 23:26:22 +01:00
Pierre Prinetti	c1bcc0c517	Run go fmt This patch is the result of running `go fmt ./...` with Go v1.16.15.	2022-07-04 15:20:49 +01:00
Nazar Kanaev	ee2a825cf0	get rss link when atom link is present found in: https://rss.nytimes.com/services/xml/rss/nyt/Arts.xml when both rss and atom link elements are present, xml parser returns empty string. provide default namespace to capture rss link properly.	2022-05-03 15:35:57 +01:00
Nazar Kanaev	be7af0ccaf	handle invalid chars in non-utf8 xml	2022-02-14 15:23:55 +00:00
Nazar Kanaev	18221ef12d	use bytes.Buffer instead	2022-02-14 11:05:38 +00:00
Nazar Kanaev	d7253a60b8	strip out invalid xml characters	2022-02-12 23:42:44 +00:00
Nazar Kanaev	2de3ddff08	fix test	2022-02-12 23:41:01 +00:00
nkanaev	52cc8ecbbd	fix encoding	2022-01-24 16:47:32 +00:00
nkanaev	bff7476b58	refactoring	2022-01-24 12:50:52 +00:00
nkanaev	26b87dee98	remove html tags from titles	2021-11-10 10:54:12 +00:00
Karol Kosek	19ecfcd0bc	ParseRSS: accept any file with audio/ media type as podcast There are some podcasts that use audio/opus files (mostly as an alternative, but still), which prevents the audio attachment from being displayed. Instead of increasing the list of allowed formats (because audio/mp3 would be quite useful on the list too), I guess it'd be better to give any audio/ media type to the user-agent and let it worry about it. :^)	2021-07-28 09:31:27 +01:00
Nazar Kanaev	d203d38de6	fix empty feed parsing	2021-07-01 14:10:22 +01:00
Nazar Kanaev	e54df07a40	use rdf description	2021-04-15 10:29:35 +01:00
Nazar Kanaev	f8455236dc	rdf date & content	2021-04-15 10:27:50 +01:00
Nazar Kanaev	fbb0dfed47	remove bom	2021-04-07 10:25:30 +01:00
Nazar Kanaev	144fc1606a	remove feed hacks from storage	2021-04-05 20:59:15 +01:00
Nazar Kanaev	fa2fad0ff6	cleanup	2021-04-05 10:01:20 +01:00
Nazar Kanaev	63ad971890	unset audio/image if present in the content	2021-04-04 21:31:25 +01:00
Nazar Kanaev	0828d6782e	extract date parser to a new file	2021-04-04 20:45:13 +01:00
Nazar Kanaev	cf5856bdf7	set missing times	2021-04-04 20:42:52 +01:00
Nazar Kanaev	e50c7e1a51	handle html type atom text	2021-04-02 22:26:45 +01:00
Nazar Kanaev	0a0db68905	feedburner	2021-04-02 22:26:45 +01:00
Nazar Kanaev	36bc84d99a	increase lookup length	2021-04-02 22:26:44 +01:00
Nazar Kanaev	7dbfecdba1	extract thumbnails from vimeo feeds	2021-04-02 22:26:44 +01:00
Nazar Kanaev	fafa6286d4	parser fixes	2021-04-02 22:26:44 +01:00
Nazar Kanaev	cc51fe01c2	give priority to content:encoded	2021-04-02 22:26:44 +01:00
Nazar Kanaev	51cbdea31f	podcasts	2021-04-02 22:26:44 +01:00
Nazar Kanaev	6685bce51c	extract data from media elements	2021-04-02 22:26:44 +01:00
Nazar Kanaev	e0e6166cdf	fix feed sniff reader	2021-04-02 22:26:44 +01:00
Nazar Kanaev	c469749eaa	rename packages	2021-04-02 22:26:44 +01:00

34 Commits