Module: XPostSanitizer

Defined in:
lib/x_post_sanitizer.rb,
lib/x_post_sanitizer/version.rb

Defined Under Namespace

Classes: Error

Constant Summary

VERSION =
"0.1.0"

Class Method Summary

Class Method Details

.expand_urls_in_text(tweet, text) ⇒ String

Replaces each shortened URL in text with the matching expanded_url from the tweet's entities.urls. Returns text unchanged when the tweet has no URL entities.

Parameters:

  • tweet (Hash<String, Object>)

    Tweet object

  • text (String)

Returns:

  • (String)

# File 'lib/x_post_sanitizer.rb', line 36

def self.expand_urls_in_text(tweet, text)
  urls = tweet.dig("entities", "urls")

  return text unless urls

  # Replace from the last URL backwards so the indices of earlier URLs stay valid
  urls.reverse.each_with_object(text.dup) do |url, expanded|
    pos1 = url.dig("indices", 0)
    pos2 = url.dig("indices", 1)
    expanded[pos1, pos2 - pos1] = url["expanded_url"] if url["expanded_url"] && pos1 && pos2
  end
end
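
The index-based replacement above can be tried in isolation. The following is a minimal sketch, not the gem's own code: `expand_urls` is a hypothetical standalone copy of the logic, and the sample tweet and its URLs are made up for illustration. It shows why the entries are processed in reverse: each replacement changes the string length, so working from the end keeps the indices of the earlier URLs valid.

```ruby
# Hypothetical standalone copy of the logic in expand_urls_in_text.
def expand_urls(tweet, text)
  urls = tweet.dig("entities", "urls")
  return text unless urls

  # Walk the entries back to front so earlier indices are not shifted.
  urls.reverse.each_with_object(text.dup) do |url, expanded|
    first = url.dig("indices", 0)
    last  = url.dig("indices", 1)
    next unless url["expanded_url"] && first && last

    expanded[first, last - first] = url["expanded_url"]
  end
end

# Made-up sample data: only the fields the method reads are included.
sample_tweet = {
  "entities" => {
    "urls" => [
      { "indices" => [5, 13],  "expanded_url" => "https://example.com/first" },
      { "indices" => [18, 26], "expanded_url" => "https://example.com/second" }
    ]
  }
}

puts expand_urls(sample_tweet, "See: t.co/aaa and t.co/bbb now")
# prints "See: https://example.com/first and https://example.com/second now"
```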

.remove_media_urls_in_tweet(tweet, text) ⇒ String

Removes the url of each media entry attached to the tweet from text, then strips leading and trailing whitespace. Returns text unchanged when the tweet has no media.

Parameters:

  • tweet (Hash<String, Object>)

    Tweet object

  • text (String)

Returns:

  • (String)

# File 'lib/x_post_sanitizer.rb', line 63

def self.remove_media_urls_in_tweet(tweet, text)
  medias = get_medias(tweet)

  return text if medias.empty?

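  medias.each_with_object(text.dup) do |media, t|
    # Skip media entries without a url; gsub! raises TypeError on a nil pattern
    next unless media["url"]

    t.gsub!(media["url"], "")
    t.strip!
  end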
end
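
As a rough illustration of the removal step, here is a sketch under stated assumptions: `get_medias` is not shown on this page, so the hypothetical helper `strip_media_urls` takes the media list directly, and the sample URL is invented. Because the pattern passed to `gsub!` is a String, it is matched literally rather than as a regular expression.

```ruby
# Hypothetical helper: takes the media entries directly instead of a tweet,
# because the gem's get_medias lookup is not documented here.
def strip_media_urls(medias, text)
  medias.each_with_object(text.dup) do |media, stripped|
    url = media["url"]
    next unless url # nothing to remove for this entry

    stripped.gsub!(url, "") # String pattern, so the URL is matched literally
    stripped.strip!         # drop the whitespace left around the removed URL
  end
end

# Made-up sample data.
medias = [{ "url" => "https://t.co/abc123" }]

puts strip_media_urls(medias, "Look at this photo https://t.co/abc123")
# prints "Look at this photo"
```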

.sanitize_text(tweet, use_retweeted_tweet: true, expand_url: true, remove_media_url: true, unescape: true) ⇒ String

Sanitizes the text of an X post (formerly a Twitter tweet).

Parameters:

  • tweet (Hash<String, Object>)

    Tweet object

  • use_retweeted_tweet (Boolean) (defaults to: true)

    Use the original retweeted tweet (retweeted_status) if it exists

  • expand_url (Boolean) (defaults to: true)

    Whether to expand URLs in the tweet (e.g. t.co URL -> original URL)

  • remove_media_url (Boolean) (defaults to: true)

    Whether to remove media URLs from the tweet

  • unescape (Boolean) (defaults to: true)

    Whether to unescape HTML entities in the tweet (e.g. &gt; &lt; -> > <)

Returns:

  • (String)

    Sanitized text of the tweet

# File 'lib/x_post_sanitizer.rb', line 19

def self.sanitize_text(tweet, use_retweeted_tweet: true, expand_url: true, remove_media_url: true, unescape: true)
  # Original RT status exists in retweeted_status
  tweet = tweet["retweeted_status"] if use_retweeted_tweet && tweet["retweeted_status"]

  text = tweet_full_text(tweet)
  text = expand_urls_in_text(tweet, text) if expand_url
  text = remove_media_urls_in_tweet(tweet, text) if remove_media_url
  text = CGI.unescapeHTML(text) if unescape
  text
end
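
The retweet selection and unescape steps can be seen on a small example. This is only a sketch: `sketch_sanitize` is a hypothetical, cut-down stand-in that omits URL expansion and media removal, and the sample retweet is invented. `CGI.unescapeHTML` comes from Ruby's standard `cgi` library.

```ruby
require "cgi"

# Hypothetical, reduced version of sanitize_text: it keeps only the
# retweet selection and the HTML unescape step.
def sketch_sanitize(tweet, use_retweeted_tweet: true, unescape: true)
  # Prefer the original status when the tweet is a retweet.
  tweet = tweet["retweeted_status"] if use_retweeted_tweet && tweet["retweeted_status"]

  text = tweet["full_text"] || tweet["text"]
  text = CGI.unescapeHTML(text) if unescape
  text
end

# Made-up sample data: a retweet whose own text is truncated.
retweet = {
  "full_text" => "RT @someone: 1 &lt; 2 &amp;&amp; 3 &gt; ...",
  "retweeted_status" => { "full_text" => "1 &lt; 2 &amp;&amp; 3 &gt; 2" }
}

puts sketch_sanitize(retweet)
# prints "1 < 2 && 3 > 2"

puts sketch_sanitize(retweet, use_retweeted_tweet: false, unescape: false)
# prints "RT @someone: 1 &lt; 2 &amp;&amp; 3 &gt; ..."
```

In the full method, URL expansion runs first, presumably because the indices in entities.urls refer to positions in the original, still-escaped text.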

.tweet_full_text(tweet) ⇒ String

Returns the full_text attribute if it exists, otherwise the text attribute.

Parameters:

  • tweet (Hash<String, Object>)

    Tweet object

Returns:

  • (String)

    The full_text attribute if it exists, otherwise the text attribute

# File 'lib/x_post_sanitizer.rb', line 53

def self.tweet_full_text(tweet)
  tweet["full_text"] || tweet["text"]
end