Access issues by domain names (Atom feed) #132

Closed
2 of 12 tasks
karlcow opened this issue Jun 3, 2014 · 17 comments

@karlcow
Member

karlcow commented Jun 3, 2014

It can be useful for a website to know the status of all issues impacting a domain name.

Searching by a domain name, e.g. example.org, should give the list of all issues related to that domain name.
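Since webcompat.com issues live in a GitHub repository, a first approximation of that search is a GitHub issue search restricted to the repo. A minimal sketch, assuming the webcompat/web-bugs repo and an in:title qualifier (both illustrative, not the final implementation):

import requests

def search_issues_for_domain(domain):
    """Return GitHub issues whose title mentions the domain."""
    resp = requests.get(
        'https://api.github.com/search/issues',
        params={'q': '{0} in:title repo:webcompat/web-bugs'.format(domain)},
        headers={'Accept': 'application/vnd.github.v3+json'})
    resp.raise_for_status()
    return resp.json()['items']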

(added on March 2017)

  • Needs unit tests for 200 OK
  • Needs unit tests for 404
  • Needs unit tests for the helper functions grabbing the data
  • Needs unit tests for valid feeds and valid domains
  • Requires a route for /feeds/<domain_name>
  • Makes sure the domain is not a random string (XSS); see the validation sketch after this list
  • Checks the performance impact
  • Helper functions for gathering the data from a domain_name search string:
    • issue summary
    • latest comment date
    • issue number
    • (latest comment text?)
    • status
    • only open issues?
  • Templates for the feed (entry and feed.atom)
  • Adds a home page for feeds explaining the mechanism and the updates policy
  • Creates a system for delivering feeds on a static basis
  • (?) Auto-create on the fly or generate daily with a cron
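A minimal sketch of the XSS check from the list above, assuming we reject anything that does not match a conservative hostname grammar; the regex and helper name are illustrative:

import re

# Labels of 1-63 chars, letters/digits/hyphens, no leading/trailing
# hyphen, and at least one dot separating two labels.
DOMAIN_RE = re.compile(
    r'^(?!-)[a-z0-9-]{1,63}(?<!-)(\.(?!-)[a-z0-9-]{1,63}(?<!-))+$')

def is_valid_domain(candidate):
    """Accept only strings that look like a dotted hostname."""
    return bool(DOMAIN_RE.match(candidate.lower()))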
@miketaylr
Member

This is our RSS feature.

@miketaylr miketaylr self-assigned this Sep 30, 2015
@hallvors
Contributor

#788 will help

@miketaylr miketaylr assigned karlcow and unassigned miketaylr Oct 17, 2016
@miketaylr
Member

No published branches yet, but @karlcow has a prototype in progress on his laptop. Assigning to him.

@miketaylr
Member

Closing #60 (comment) as a dupe of this.

@miketaylr miketaylr changed the title Access issues by domain names Access issues by domain names (RSS feed) Oct 17, 2016
@miketaylr miketaylr changed the title Access issues by domain names (RSS feed) Access issues by domain names (Atom feed) Oct 17, 2016
@miketaylr miketaylr reopened this Oct 17, 2016
@karlcow
Member Author

karlcow commented Mar 30, 2017

Preserving some things I had done for #60 so I can delete my local branch.

for webcompat/views.py

@app.route('/feeds/<domain_name>')
def domain_feed(domain_name):
    """Route to display a feed for a domain name.

    - domain_name would be `mozilla.org`.
    - should make a search of all titles, numbers, latest comment dates.
    """
    # The user is probably not necessary here.
    if g.user:
        get_user_info()
    # Search the domain_name and return the relevant data;
    # the helper is still to be defined in helpers.
    domain_data = feed_summary(domain_name)
    return render_template('feed.atom', domain_data=domain_data)
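feed_summary does not exist yet; a hypothetical shape for it, reusing the fields from the first comment (summary, number, status, latest comment date) and the hypothetical search_issues_for_domain helper sketched earlier:

def feed_summary(domain_name):
    """Gather the feed-relevant data for every issue matching a domain."""
    # search_issues_for_domain is the hypothetical GitHub-search helper
    # from the earlier sketch; the returned field names are GitHub's.
    issues = search_issues_for_domain(domain_name)
    return [{'summary': issue['title'],
             'number': issue['number'],
             'status': issue['state'],
             'updated': issue['updated_at']}
            for issue in issues]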

@karlcow
Member Author

karlcow commented Mar 30, 2017

Made the first comment more descriptive with the list of things to do.

@karlcow
Member Author

karlcow commented May 17, 2017

Note to self (it will grow with time):

There are a couple of approaches to explore here. I need to assess the trade-offs of each and the likelihood of creating a performance impact on the application.

Some possibilities:

  1. Generate the feed through a search query each time the feed is requested.
    • Pro: the information is always fresh.
    • Con: a search query is issued on every request. Even with caching headers, feed readers are not very respectful of HTTP best practices, so they will hit the server every time. That might exhaust our search rate limit.
  2. Generate a static feed at the first request, once an hour or once a day, and deliver the static file for each subsequent request (see the sketch after this list). There might even be a Flask extension that already does this; to research.
    • Pro: cache/performance friendly. We cache only the domain-name feeds which have been requested, not all domain names.
    • Con: the information age is defined by the cache we create (one day old, for example).
  3. Generate feeds once a day with a cron, for every known domain we currently have on webcompat.com.
    • Pro: cache/performance friendly.
    • Con: same issues as 2, plus useless feeds kept around.
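A minimal sketch of option 2, assuming feeds are written under data/feed/ and are considered fresh for one day; the paths, MAX_AGE, and the build_feed generator are assumptions:

import os
import time

FEED_DIR = 'data/feed'
MAX_AGE = 24 * 60 * 60  # one day, in seconds

def feed_path(domain):
    return os.path.join(FEED_DIR, '{0}.atom'.format(domain))

def is_fresh(domain):
    """True when a static feed exists and is younger than MAX_AGE."""
    path = feed_path(domain)
    return (os.path.exists(path) and
            time.time() - os.path.getmtime(path) < MAX_AGE)

def get_feed(domain):
    """Serve the cached feed, regenerating at most once per MAX_AGE."""
    if not is_fresh(domain):
        build_feed(domain, feed_path(domain))  # hypothetical generator
    with open(feed_path(domain)) as f:
        return f.read()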

Some additional issues:

  • Domain names are not always the source of the issue. Think about a website embedding Disqus comments: the real issue is on Disqus, which is different from the domain name that has been reported.
  • Changing URIs in the bug report after analysis.
  • Domain names of the same family (music.yandex.ru and radio.yandex.ru), or blogspot.* (yes, blogspot spans all country TLDs).
  • The information relevant to the feed has to be determined:
    • domain name
    • URL
    • steps to reproduce
    • screenshot?
    • how the feed <item> evolves with time:
      • Do we advertise the change of status?
      • Do we keep an item once the issue is closed, or do we remove it?
      • Do we advertise comments when there is a new one on the issue? The comment content, or just a link to the comment?
      • Do we create individual items for each new event, or just one item we refresh with new information?

Some possible dependencies/information:

@karlcow
Member Author

karlcow commented May 18, 2017

karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 16, 2017
karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 16, 2017
karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 16, 2017
karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 16, 2017
- pep257
- orders of import
- ignore webcompat.views check
karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 16, 2017
- Adds a /feed Blueprint
- Prepares for the main request feed function
@karlcow
Member Author

karlcow commented Aug 16, 2017

Let's start the experiment. Code! 🚨
And we will see if we have to throw everything away. 🗑

@karlcow
Member Author

karlcow commented Aug 16, 2017

  • I created a Blueprint for /feeds
  • and added a couple of tests

The main thing I will be experimenting with is the creation of static files, either generated on first request or based on a cron.
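For the record, a minimal sketch of what that Blueprint wiring could look like; the module layout and names are assumptions, not necessarily the committed code:

from flask import Blueprint

feeds = Blueprint('feeds', __name__, url_prefix='/feeds')

@feeds.route('/')
def feeds_home():
    """Home page explaining the feeds mechanism and updates policy."""
    return 'Feeds for webcompat.com domains.'

# and in the application setup:
# app.register_blueprint(feeds)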

karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 16, 2017
- Fixes tests for feeds/ home page
- Creates shells for prose
- Defines routes for feeds/
karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 16, 2017
- Handles non existent domain names.
- Creates a helper file for all things strictly related to feeds
- Adjusts test for the right routes and content
karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 17, 2017
karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 17, 2017
@karlcow
Member Author

karlcow commented Aug 17, 2017

There is a feed feature in Werkzeug to keep in mind.
http://flask.pocoo.org/snippets/10/
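That snippet builds feeds with AtomFeed from werkzeug.contrib.atom (it shipped with Werkzeug at the time; it was removed in Werkzeug 1.0). A minimal sketch of how it could render our domain data; the issue dict shape is an assumption:

from flask import request
from werkzeug.contrib.atom import AtomFeed

def render_atom(domain, issues):
    """Build an Atom response from a list of issue dicts."""
    feed = AtomFeed('webcompat.com issues for {0}'.format(domain),
                    feed_url=request.url, url=request.url_root)
    for issue in issues:
        feed.add(issue['summary'],
                 content_type='text',
                 url=issue['url'],
                 updated=issue['updated'])  # must be a datetime
    return feed.get_response()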

@karlcow
Member Author

karlcow commented Aug 17, 2017

Dumping ideas. Notebook style. 📓

🐍 pseudo-code

@feeds.route('/<domain>', methods=['GET'])
def domain_feed(domain):
    """Serve a feed for a specific domain name."""
    # Have we handled this domain already?
    if is_known_domain(domain):
        # Do we have a static atom feed file for it?
        if not is_static_feed(domain):
            # Let's create the feed in data/feed/
            create_feed(domain)
        # we can serve the feed to users.
        return serve_domain_feed(domain)
    else:
        # if we don't know anything we return 404
        return (
            '{domain} has no feed'.format(domain=domain),
            404,
            {'Content-Type': 'text/plain'})

I want to minimize the impact of badly behaved feed readers. No matter how much caching you set on feed resources, many feed readers ignore it and request every couple of minutes. So, to avoid generating the feed on every request, I want to serve a static file generated at the first request.

Another benefit is that we get files only for the domains people are interested in.

An interesting question will come up with updating, but let's say it's an issue we have to deal with later.
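Serving the static file itself is close to a one-liner with Flask's send_from_directory; a minimal sketch of serve_domain_feed, assuming the feeds live under data/feed/ as <domain>.atom:

from flask import send_from_directory

def serve_domain_feed(domain):
    """Send the pre-generated static feed with the Atom media type."""
    return send_from_directory(
        'data/feed', '{0}.atom'.format(domain),
        mimetype='application/atom+xml')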

Some issues 🚨

  • Data quality. For example, the domain is not always the domain of the issue. I can think of Marfeel issues, or of the issues recently reported for the YouTube iframe, which show up on plenty of domain names.
  • Managing duplicates, so we do not have a feed with tens of copies of the same issue.
  • Defining the type of information that would be useful for domain owners. Is it the change of status that is interesting, or the new comments? This is the most interesting and challenging part; it ties into which data we have and what we can share that will be useful to others.

karlcow added a commit to karlcow/webcompat.com that referenced this issue Aug 24, 2017
@karlcow
Member Author

karlcow commented Aug 30, 2017

Data quality is interesting…
From a dump I have of all the issues as of July 2017, around 7920 issues, the domain names are not always there, or are bogus, or follow irregular patterns.

So far I have found 370 issues with bogus domains.
I tried to cover as many of the possible patterns as I could.

title is the issue title, so something à la www.nytimes.com - desktop site instead of mobile site

Current version; it will evolve.

import urlparse  # Python 2; urllib.parse in Python 3

def extract_domain_name(title, issue_number):
    """Extract the domain name from the title string."""
    # a domain name doesn't contain spaces
    candidate = title.split(' ', 1)[0]
    # domain names are lowercase
    candidate = candidate.lower()
    # a domain name contains at least one "."
    if '.' not in candidate:
        return 'BOGUS', issue_number
    # tuple of bogus patterns to check against
    bogus_start_patterns = ('resource://', 'file://', 'chrome://')
    if candidate.startswith(bogus_start_patterns):
        return 'BOGUS', issue_number
    # view-source:… still contains a domain name.
    if candidate.startswith('view-source:'):
        candidate = candidate.split('view-source:')[1]
    if ':' in candidate and not candidate.startswith('http'):
        candidate = candidate.split(':')[0]
        candidate = 'http://{}'.format(candidate)
    # some titles start with http(s); clean up down to the host.
    if candidate.startswith('http://') or candidate.startswith('https://'):
        candidate = urlparse.urlsplit(candidate).netloc
        candidate = candidate.split(':')[0]
    # some domains come with a path
    if '/' in candidate:
        candidate = candidate.split('/')[0]
    # some bogus domains with &
    if '&' in candidate:
        candidate = candidate.split('&')[0]
    # handling local/private addresses
    local_patterns = ('10.', '127.0.0.1', '192.168.', '172.')
    if candidate.startswith(local_patterns):
        return 'BOGUS', issue_number
    return candidate.encode('utf-8')

Some of the things I'm fixing on the fly have an opportunity to be fixed once and for all.
I could spit out a FIXME for those, so the data quality improves for the next run.

I will re-run it soon with a fresh issue dump.

There are still some issues where the domain name in the title is different from the URL: in the body.
I can probably create an additional check to extract these and compare (sketched below).

This is just for dumping a DB of domain names to generate feeds, but it could ultimately be reused for normalizing the data we receive from people.
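A hypothetical sketch of that title-vs-URL cross-check, reusing extract_domain_name above; extract_url_domain and the issue dict shape are assumptions:

def extract_url_domain(body):
    """Pull the host out of the URL line of an issue body."""
    for line in body.splitlines():
        # webcompat issue bodies carry a '**URL**: http…' line
        if line.startswith('**URL**') and 'http' in line:
            url = 'http' + line.split('http', 1)[1].strip()
            return urlparse.urlsplit(url).netloc
    return ''

def domain_mismatches(issues):
    """Yield (number, title_domain, url_domain) when the two disagree."""
    for issue in issues:
        title_domain = extract_domain_name(issue['title'], issue['number'])
        url_domain = extract_url_domain(issue['body'])
        if title_domain != url_domain:
            yield (issue['number'], title_domain, url_domain)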

  • 7920 domain names (July 2017)
  • 7550 valid domain names, aka 95% (minus all the small things I missed)
  • 370 bogus domain names (in issue titles)
  • 4141 unique domain names, aka 55% (4141 potential feeds)
  • top 50:
 259 www.youtube.com
 125 www.facebook.com
 112 www.google.com
 110 vk.com
  76 www.netflix.com
  76 m.youtube.com
  65 web.whatsapp.com
  62 m.facebook.com
  55 webcompat.com
  53 addons.mozilla.org
  47 www.coco.fr
  40 twitter.com
  34 music.yandex.ru
  33 www.mozilla.org
  33 s0.2mdn.net
  32 www.twitch.tv
  32 support.mozilla.org
  32 mail.google.com
  27 github.com
  21 www.reddit.com
  20 www.amazon.com
  20 mega.nz
  19 www.pandora.com
  19 www.amazon.in
  19 play.google.com
  18 www.amazon.de
  18 mobile.twitter.com
  17 www.amazon.co.jp
  17 apps.facebook.com
  16 www.hulu.com
  16 www.bing.com
  15 www.primevideo.com
  15 www.linkedin.com
  15 radio.garden
  15 accounts.google.com
  14 www.yahoo.com
  14 outlook.live.com
  13 mailmanager.cityweb.de
  13 inbox.google.com
  13 imgur.com
  13 docs.google.com
  13 chaturbate.com
  12 www.theverge.com
  12 www.nasa.gov
  12 www.google.co.in
  12 video.corriere.it
  12 sj.myie9.com
  12 g1.globo.com
  12 drive.google.com
  12 developer.apple.com

Google properties:

 112 www.google.com
  32 mail.google.com
  19 play.google.com
  15 accounts.google.com
  13 inbox.google.com
  13 docs.google.com
  12 www.google.co.in
  12 drive.google.com
  11 google.com
   7 www.google.ca
   7 news.google.com
   5 www.google.ro
   5 www.google.fr
   5 support.google.com
   5 images.google.com
   4 www.google.com.mx
   4 www.google.co.uk
   4 translate.google.com
   4 tpc.googlesyndication.com
   4 plus.google.com
   4 hangouts.google.com
   4 fonts.google.com
   3 www.google.se
   3 www.google.it
   3 www.google.com.br
   3 www.google.co.jp
   3 groups.google.com
   3 developers.google.com
   2 www.googleadservices.com
   2 www.google.ru
   2 www.google.de
   2 www.google.com.pk
   2 www.google.com.eg
   2 voice.google.com
   2 santatracker.google.com
   2 photos.google.com
   2 news.google.co.in
   2 keep.google.com
   2 insideabbeyroad.withgoogle.com
   2 gmail.google.com
   2 calendar.google.com
   1 www.google.sk
   1 www.google.pt
   1 www.google.me
   1 www.google.hu
   1 www.google.es
   1 www.google.com.vn
   1 www.google.com.ua
   1 www.google.com.sa
   1 www.google.com.co
   1 www.google.com.bd
   1 www.google.co.th
   1 www.google.co.id
   1 www.google.ch
   1 www.google.bg
   1 www.drive.google.com
   1 trends.google.com
   1 translate.googleusercontent.com
   1 translate.google.ro
   1 translate.google.co.kr
   1 testmysite.thinkwithgoogle.com
   1 svg-edit.googlecode.com
   1 streetart.withgoogle.com
   1 storage.googleapis.com
   1 sites.google.com
   1 scholar.google.com
   1 r4---sn-4g5edn7s.googlevideo.com
   1 r3---sn-gwpa-itqd.googlevideo.com
   1 r2---sn-4g5edned.googlevideo.com
   1 productforums.google.com
   1 privacy.google.com
   1 opensource.google.com
   1 news.google.com.tw
   1 news.google.com.br
   1 myaccount.google.com
   1 googleweblight.com
   1 google.co.in
   1 enterprise.google.com
   1 encrypted.google.com
   1 earth.google.com
   1 console.cloud.google.com
   1 com.google
   1 codelabs.developers.google.com
   1 chrome.google.com
   1 books.google.de
   1 books.google.ca
   1 apps.google.com
   1 analytics.googleblog.com

@karlcow
Member Author

karlcow commented Aug 30, 2017

Do we create a feed when there is no valid issue associated with this domain?

@karlcow
Member Author

karlcow commented Aug 30, 2017

Ah… crap…

Once the BOGUS titles are removed, we still have quite a lot of differences between title domains and URL domains. And a lot of recent issues. That comes from Softvision not entering the same domain in the title as in the URL. I think they fixed it after I mentioned it, but I didn't realize we had so many bad ones.

I need to fix this, automatically if I prepare the data well. 😭

Below: (issue_number, title_domain, URL_domain)

(1005, 'jal.co.jp', 'sp5971.jal.co.jp')
(1052, 'excite.co.jp', 'a.excite.co.jp')
(1053, 'excite.co.jp', 'a.excite.co.jp')
(1083, 'btv.cat', 'www.btv.cat')
(110, 'webcrawler.com', 'www.webcrawler.com')
(1139, 'lastampa.it', '')
(1145, 'smo.suumo.jp', 'smp.suumo.jp')
(1161, 'bosch-home.pl', 'www.bosch-home.pl')
(1182, 'video.gazzetta.it', '')
(1183, 'video.gazzetta.it', '')
(1184, 'sportmediaset.mediaset.it', '')
(1185, 'video.corriere.it', '')
(1242, 'menshealth.com', 'www.menshealth.com')
(1257, 'menshealth.com', 'www.menshealth.com')
(1267, 'womenshealthmag.com', 'www.womenshealthmag.com')
(1285, 'm.facebook.com', 'spam-removed')
(1301, 'menshealth.com', 'www.menshealth.com')
(139, 'moleskine.com', 'www.moleskine.com')
(1409, 'webcompat.com', 'support.mozilla.org')
(141, 'virginamerica.com', 'www.virginamerica.com')
(1528, 'docs.google.com', 'goo.gl')
(1591, 'www.facebook.com', 'spam-removed')
(1592, 'www.facebook.com', 'spam-removed')
(1593, 'www.facebook.com', 'spam-removed')
(1595, 'www.facebook.com', 'spam-removed')
(1596, 'www.facebook.com', 'spam-removed')
(1597, 'www.facebook.com', 'spam-removed')
(1598, 'www.facebook.com', 'spam-removed')
(1601, 'www.facebook.com', 'spam-removed')
(1602, 'www.facebook.com', 'spam-removed')
(1603, 'www.facebook.com', 'spam-removed')
(1604, 'www.facebook.com', 'spam-removed')
(1605, 'www.facebook.com', 'spam-removed')
(1611, 'www.facebook.com', 'spam-removed')
(1612, 'www.facebook.com', 'spam-removed')
(1614, 'www.facebook.com', 'spam-removed')
(1616, 'www.facebook.com', 'spam-removed')
(1617, 'www.facebook.com', 'spam-removed')
(1652, 'm.facebook.com', '')
(1687, 'www.fb.com', 'spam-removed')
(1688, 'm.fb.com', '')
(1689, 'www.fb.com', 'spam-removed')
(1690, 'm.fb.com', 'spam-removed')
(1691, 'www.fb.com', 'spam-removed')
(1692, 'www.fb.com', 'spam-removed')
(174, 'www.jetblue.com', 'jetblue.com')
(1807, 'amazon.com', 'https:')
(1850, 'mozillafestival.org', '2015.mozillafestival.org')
(1917, 'www.flipkart.com', 'www')
(1995, '8888.186tcye.pw', '')
(20, 'crosswalkdp.com', 'www.crosswalkdp.com')
(2001, 'glasses.com', 'www.glasses.com')
(2007, 'bioskop21.id', '')
(2008, 'bioskop21.id', '')
(2017, 'appinstallsmobi.com', '')
(2019, 'webcompat.com', 'www.6666hh.com')
(207, 'nfl.com', 'www.nfl.com')
(2107, 'allindiaradio.govt.in', 'allindiaradio.gov.in')
(2181, 'm.facebook.com', '')
(2232, 'm.facebook.com', '')
(2240, 'yuku.com', 'www.yuku.com')
(2243, 'www.sz-runxin.com', '')
(2314, 'saa.qualtrics.com', '')
(2396, 'hotmoza.com', '')
(2433, 'www.marcoborla.it', '')
(2476, 'barbershop.org', 'ebiz.barbershop.org')
(2498, 'video.js', 'github.com')
(2499, 'discovery.com', 'www.discovery.com')
(2502, '1g22.com', '')
(2503, '1g22.com', '')
(2505, 'jornada.una.mx', 'www.jornada.unam.mx')
(2740, 'oneviewcalendar.com', 'www.oneviewcalendar.com')
(28, 'webcompat.com', 'github.com')
(2822, 'dragon8.troyhero.com', '')
(2823, '546r.com', '')
(2884, 'www.tangerine.ca', 'secure.tangerine.ca')
(2891, 'www.luludai.cc', '')
(3, 'volcanicpixels.com', 'www.volcanicpixels.com')
(3066, 'chromestatus.com', 'www.chromestatus.com')
(3146, 'mobile22.gameassists.co.uk', '`http')
(3464, 'codepen.io', '')
(3623, 'largepenissociety.tumblr.com', 'large*society.tumblr.com')
(372, 'cbc.ca', '')
(3835, 'outlook.live.com', '')
(385, 'pch.sweeps.com', 'pch sweeps.com')
(399, 'www.hwbank.it,', 'www.hwbank.it, www.netxhs.it')
(4119, 'www.', 'www. webcompat.com')
(45, 'http.req.url.http_url_safe', 'www.ibm.com')
(4729, 'm.weibo.cn', 'm.weibo.cn -  swipe gesture issue')
(490, 'm.spiegel.de', 'm.spiegel.de   or   spiegel.de')
(4979, 'www.facebook.com', '')
(4987, 'answers.yahoo.com', '')
(5007, 'www.reddit.com', '')
(5008, 'www.reddit.com', '')
(5009, 'www.reddit.com', '')
(5011, 'www.reddit.com', '')
(5012, 'www.twitter.com', '')
(5070, 'www.linkedin.com', '')
(5073, 'www.linkedin.com', '')
(51, 'expedia.co.jp', 'www.expedia.co.jp')
(521, 'okcupid.com', 'www.okcupid.com')
(53, 'nascarwagers.com', 'www.nascarwagers.com')
(54, 'xvideos.com', 'www.xvideos.com')
(5488, 'www.xvideos.com', '')
(5489, 'www.indeed.com', 'indeed.com')
(5509, 'www.spotify.com', 'open.spotify.com')
(5566, 'www.bestbuy.com', 'www.bestbuy-jobs.com')
(5568, 'www.bestbuy.com', 'www.bestbuy-jobs.com')
(5573, 'www.deals.bestbuy.com', 'deals.bestbuy.com')
(5589, 'www.baidu.com.com', 'goo.gl')
(5591, 'www.baidu.com', 'music.baidu.com')
(5592, 'www.baidu.com', 'voice.baidu.com')
(5593, 'www.baidu.com', 'voice.baidu.com')
(5602, 'www.baidu.com', 'goo.gl')
(5604, 'www.disney.com', 'm.disneystore.com')
(5605, 'www.disney.com', 'm.disneystore.com')
(5654, 'www.homedepot.com', 'm.homedepot.com')
(5656, 'www.homedepot.com', 'm.homedepot.com')
(57, 'momondo.com', 'm.momondo.com')
(59, 'independent.co.uk', 'www.independent.co.uk')
(5906, 'www.rumble.com', 'rumble.com')
(5910, 'www.rumble.com', 'rumble.com')
(5936, 'm.privacy2browsing.com', '[removed]')
(5949, 'www.rumble.com', 'rumble.com')
(5953, 'www.rumble.com', 'rumble.com')
(5957, 'www.rumble.com', 'rumble.com')
(6016, 'www.gomovies.to', 'gomovies.to')
(6021, 'www.gomovies.to', 'gomovies.to')
(6023, 'www.gomovies.to', 'gomovies.to')
(6049, 'www.gomovies.to', 'gomovies.to')
(6141, 'youtube.com', 'www.youtube.com')
(617, 'grammarly.com', '')
(6183, 'www.citi.com', 'online.citi.com')
(6184, 'www.citi.com', 'www.privatebank.citibank.com')
(6186, 'www.citi.com', 'www.privatebank.citibank.com')
(6195, 'www.businessinsider.com', 'intelligence.businessinsider.com')
(6216, 'www.wikipedia.org', 'goo.gl')
(6217, 'www.wikipedia.org', 'en.m.wikipedia.org')
(6218, 'www.wikipedia.org', 'en.m.wikipedia.org')
(6219, 'www.wikipedia.org', 'en.m.wikipedia.org')
(6248, 'www.wikipedia.org', 'en.m.wikivoyage.org')
(6250, 'www.wikipedia.org', 'm.mediawiki.org')
(6254, 'www.yahoo.com', 'fr.yahoo.com')
(6255, 'www.yahoo.com', 'research.yahoo.com')
(6256, 'www.yahoo.com', 'research.yahoo.com')
(6397, 'www.yahoo.com', 'fr.yahoo.com')
(6398, 'www.yahoo.com', 'login.yahoo.com')
(6400, 'www.yahoo.com', 'fr.sports.yahoo.com')
(6402, 'www.yahoo.com', 'fr.finance.yahoo.com')
(6403, 'www.yahoo.com', 'fr.finance.yahoo.com')
(6411, 'www.lemonde.fr', 'abo.lemonde.fr')
(6412, 'www.lemonde.fr', 'secure.lemonde.fr')
(6435, 'www.lemonde.fr', 'moncompte.lemonde.fr')
(644, 'inbox.google.com', '')
(6447, 'www.ebay.fr', 'm.ebay.fr')
(6463, 'www.ebay.fr', 'csr.ebay.fr')
(6471, 'www.ebay.fr', 'csr.ebay.fr')
(6475, 'www.ebay.fr', 'm.ebay.fr')
(6477, 'www.ebay.fr', 'm.ebay.fr')
(6499, 'www.allocine.fr', 'secure.allocine.fr')
(6567, 'www.sfr.fr', 'assistance.sfr.fr')
(6576, 'www.lequipe.fr', 'm.lequipe.fr')
(6577, 'www.ebay.fr', 'csr.ebay.fr')
(6584, 'youtube.com', 'www.youtube.com')
(6593, 'www.lequipe.fr', 'm.lequipe.fr')
(6595, 'www.lequipe.fr', 'm.lequipe.fr')
(6628, 'www.aliexpress.com', 'm.fr.aliexpress.com')
(6629, 'www.aliexpress.com', 'm.fr.aliexpress.com')
(6633, 'www.aliexpress.com', 'm.fr.aliexpress.com')
(6656, 'www.aliexpress.com', 'm.fr.aliexpress.com')
(6658, 'www.aliexpress.com', 'm.fr.aliexpress.com')
(666, 'mint.com', 'javascript')
(6663, 'www.aliexpress.com', 'm.fr.aliexpress.com')
(6668, 'www.tumblr.com', 'goo.gl')
(6725, 'www.stackoverflow.com', 'stackoverflow.com')
(6728, 'disqus.com', 'stackoverflow.blog')
(6786, 'www.bfmtv.com', 'rmc.bfmtv.com')
(6789, 'www.leparisien.fr', 'm.leparisien.fr')
(6790, 'www.leparisien.fr', 'm.leparisien.fr')
(6791, 'www.leparisien.fr', 'connect.leparisien.fr')
(6830, 'www.fnac.com', 'secure.fnac.com')
(6893, 'www.societegenerale.fr', 'm.particuliers.societegenerale.fr')
(6896, 'www.societegenerale.fr', '3qv7.la1-c1-frf.salesforceliveagent.com')
(6912, 'www.bouyguestelecom.fr', 'www.mon-compte.bouyguestelecom.fr')
(6914, 'www.bouyguestelecom.fr', 'www.assistance.bouyguestelecom.fr')
(6916, 'www.bouyguestelecom.fr', 'forum.bouyguestelecom.fr')
(6945, 'www.laposte.net', 'compte.laposte.net')
(6955, 'www.ok.ru', 'm.ok.ru')
(696, 'ign.com', 'in.ign.com')
(6973, 'www.ok.ru', 'm.ok.ru')
(6979, 'www.ok.ru', 'm.ok.ru')
(6999, 'www.ouest-france.fr', 'www.ouestfrance-immo.com')
(7, 'youtube.com', 'm.youtube.com')
(7002, 'www.ouest-france.fr', 'www.ouestfrance-immo.com')
(7006, 'www.ouest-france.fr', 'www.ouestfrance-immo.com')
(7012, 'www.deezer.com', 'support.deezer.com')
(7041, 'youtube.com', 'https:')
(707, 'myatt.com', 'myatt.com or http')
(7071, 'www.leroymerlin.fr', 'communaute.leroymerlin.fr')
(7099, 'www.libertyland.co', 'libertyland.co')
(7118, 'www.libertyland.co', 'libertyland.co')
(7119, 'www.mabanque.bnpparibas', 'mabanque.bnpparibas')
(7126, 'www.mabanque.bnpparibas', 'mabanque.bnpparibas')
(7127, 'www.mabanque.bnpparibas', 'mabanque.bnpparibas')
(7186, 'www.liberation,fr', 'www.liberation.fr')
(7188, 'www.vimeo.com', 'vimeo.com')
(7289, 'www.google.com', 'google.com')
(7290, 'www.google.com', 'google.com')
(7292, 'www.google.com', 'google.com')
(7293, 'www.google.com', 'google.com')
(7296, 'www.google.com', 'google.com')
(7298, 'www.google.com', 'google.com')
(7299, 'www.google.com', 'google.com')
(7304, 'www.google.com', 'google.com')
(7305, 'www.google.com', 'google.com')
(7309, 'www.google.com', 'google.com')
(7311, 'www.google.com', 'google.com')
(7319, 'www.google.com', 'google.com')
(7323, 'www.google.com', 'google.com')
(7356, 'www.google.com', 'google.com')
(74, 'www.fresno.courts.ca.gov', '')
(7409, 'www.hotstart.com', 'www.hotstar.com')
(7422, 'www.ntd.tv', 'mb.ntd.tv')
(7424, 'www.ntd.tv', 'mb.ntd.tv')
(7441, 'www.torrentz2.eu', 'torrentz2.eu')
(7443, 'www.ndtv.com', 'm.ndtv.com')
(7451, 'www.ndtv.com', 'm.ndtv.com')
(7468, 'www.ndtv.com', 'auto.ndtv.com')
(7470, 'www.rediff.com', 'm.rediff.com')
(7479, 'www.rediff.com', 'labs.rediff.com')
(7480, 'www.rediff.com', 'labs.rediff.com')
(7481, 'www.rediff.com', 'register.rediff.com')
(7482, 'www.rediff.com', 'ishare.rediff.com')
(7514, 'www.rediff.com', 'zarabol.rediff.com')
(7516, 'www.rediff.com', 'm.rediff.com')
(7522, 'www.rediff.com', 'mypage.rediff.com')
(7585, 'www.moneycontrol.com', 'm.moneycontrol.com')
(7587, 'www.moneycontrol.com', 'm.moneycontrol.com')
(7593, 'www.moneycontrol.com', 'm.moneycontrol.com')
(7594, 'www.snapdeal.com', 'm.snapdeal.com')
(7615, 'www.msn.com-', 'www.msn.com')
(7639, 'www.makemytrip.com', 'holidayz.makemytrip.com')
(7647, 'www.justdial.com', 't.justdial.com')
(7648, 'www.justdial.com', 't.justdial.com')
(7653, 'www.justdial.com', 't.justdial.com')
(7749, 'www.justdial.com', 't.justdial.com')
(7752, 'www.softonic.com', 'features.en.softonic.com')
(7757, 'www.indianexpress.com', 'indianexpress.com')
(7758, 'www.indianexpress.com', 'indianexpress.com')
(7795, 'www.indianexpress.com', 'indianexpress.com')
(7796, 'www.indianexpress.com', 'indianexpress.com')
(78, 'comptoir-hardware.com', 'www.comptoir-hardware.com')
(7803, 'www.xhamster.com', 'm.xhamster.com')
(7804, 'www.shopclues.com', 'm.shopclues.com')
(7858, 'www.oneindia.com', 'recharge.oneindia.com')
(7902, 'www.filehippo.com', 'filehippo.com')
(7904, 'www.indiamart.com', 'm.indiamart.com')
(7909, 'www.indiamart.com', 'm.indiamart.com')
(81, 'citroen.ru', 'www.citroen.ru')
(86, 'outlook.com', 'www.outlook.com')
(89, 'ovh.com', 'www.ovh.com')
(900, 'tastebuds.fm', 'tastebuds.fm and naukri.com')
(918, 'deceeeu.ro', 'deceeu.ro')
(955, 'tastebuds.fm', 'tastebuds.fm and naukri.com , webcompat')
(964, 'www.weibo.com.com', 'www.weibo.com')

@karlcow
Member Author

karlcow commented Aug 31, 2017

Recording here so it's not lost.
Yesterday @denschub suggested that we provide feeds only per second-level domain name, to maximize the outreach to the web developers of one company.

This could be done, aka instead of:

/feed/www.example.org
/feed/lab.example.org

we provide only:

/feed/example.org

I personally prefer the more granular version for different reasons, but I think we could do both (a grouping sketch follows this list). Some of my reasons:

  • foo.tumblr.com != bar.tumblr.com; in some cases these reflect individual contributors' choices.
  • Grouping domain names is sometimes difficult (google.fr and google.ro); second-level domain names will not catch those.
  • Sometimes a local version or a specific project is handled by a completely different team, or even a different company.
  • Companies usually know which domains they want to track.
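A minimal sketch of that second-level grouping, using the third-party tldextract package (an assumption, chosen because it consults the public suffix list so hosts under suffixes such as .co.uk are handled correctly):

import tldextract

def registrable_domain(host):
    """Collapse www.example.org and lab.example.org to example.org."""
    parts = tldextract.extract(host)
    return '{0}.{1}'.format(parts.domain, parts.suffix)

# registrable_domain('music.yandex.ru')   -> 'yandex.ru'
# registrable_domain('www.google.co.uk')  -> 'google.co.uk'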

@karlcow
Member Author

karlcow commented Feb 16, 2018

Let me kill this with fire. :) And let's revive it one day if/once we have a DB of issues.
