class CHECKING::YOU

I'm not trying to be an exact clone of `shared-mime-info`, but I think its “Recommended checking order” is pretty sane: specifications.freedesktop.org/shared-mime-info-spec/latest/

In addition to the above, CYO() supports IETF-style Media Type strings like “application/xhtml+xml” and supports `stat`-less testing of `.extname`-style Strings.

Constants

CLASS_NEEDLEMAKER

The following two `proc`s handle classwide-memoization and instance-level assignment for values that may be Enumerable but often refer to only a single Object.

For example, most `Postfix`es (file extensions) will only ever belong to a single CYO Object, but a handful represent possibly-multiple types, like how `.doc` can be an MSWord file or WordPad RTF.

These assignment procs take a storage haystack, a needle to store, and the CYO receiver to which the needle refers. They will set `haystack[needle] = CYO` if that needle is unique and unset, or they will convert an existing single `haystack[needle] = CYO` assignment to `haystack[needle] = Set[existingCYO, newCYO]`.

This is an admittedly-annoying complexity-for-performance tradeoff with the goal of allocating as few spurious containers as possible instead of explicitly initializing a Set for every needle when most of them would wastefully be a Set of just a single thing.
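A minimal sketch of the store-one-or-promote-to-Set idea described above. The `store_needle` name, its argument order, and the Symbol stand-ins for CYO objects are all assumptions for illustration, not the library's actual proc:

```ruby
require 'set'

# Hypothetical stand-in for the classwide needle-storing proc.
store_needle = lambda { |haystack, needle, receiver|
  existing = haystack[needle]
  case existing
  when nil then haystack[needle] = receiver  # first sighting: store the bare object
  when Set then existing.add(receiver)       # already shared: grow the existing Set
  else
    # Second distinct receiver: promote the single value to a Set.
    haystack[needle] = Set[existing, receiver] unless existing.equal?(receiver)
  end
  haystack
}

postfixes = {}
store_needle.call(postfixes, '.png', :image_png)  # Symbols stand in for CYO objects
store_needle.call(postfixes, '.doc', :msword)
store_needle.call(postfixes, '.doc', :rtf)
```

Here `postfixes['.png']` stays a bare value, and only `postfixes['.doc']` pays for a `Set` allocation.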

INSTANCE_NEEDLEMAKER

This is the instance-level version of the above, e.g. a CYO with only one Postfix will assign `cyo.:@postfixes = Postfix`, and a CYO with many Postfixes will assign e.g. `cyo.:@postfixes = Set[post, fix, es, …]`.
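The instance-level variant might look roughly like this. The `assign_needle` name and the use of plain `Object` receivers are assumptions made for the sketch:

```ruby
require 'set'

# Hypothetical instance-level counterpart: keep a bare value in the instance
# variable and promote it to a Set only when a second distinct value arrives.
assign_needle = lambda { |receiver, ivar, needle|
  existing = receiver.instance_variable_get(ivar)
  updated =
    case existing
    when nil then needle
    when Set then existing.add(needle)
    else existing == needle ? existing : Set[existing, needle]
    end
  receiver.instance_variable_set(ivar, updated)
}

one  = Object.new  # stands in for a CYO with a single Postfix
many = Object.new  # stands in for a CYO with several Postfixes
assign_needle.call(one, :@postfixes, '.png')
%w[.jpg .jpeg .jpe].each { |ext| assign_needle.call(many, :@postfixes, ext) }
```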

LEGENDARY_HEAVY_GLOW

Extract the heaviest member(s) from an Enumerable of weighted keys.
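As a rough illustration, assuming the Enumerable is a `Hash` whose keys respond to a weight method (as the `:weight` argument in `OUT` suggests), the extraction could be sketched like this. The `heaviest` and `weighted_glob` names and the single-value-versus-Array return shape are assumptions:

```ruby
# Hypothetical weighted key; the real library's key class is not shown here.
weighted_glob = Struct.new(:pattern, :weight)

# Return the value(s) whose key carries the greatest weight, or nil for no input.
heaviest = lambda { |weighted, weight_method = :weight|
  return nil if weighted.nil? || weighted.empty?
  top = weighted.keys.map(&weight_method).max
  winners = weighted.select { |key, _| key.public_send(weight_method) == top }.values
  winners.size == 1 ? winners.first : winners
}

globs = {
  weighted_glob.new('*.tar.gz', 60) => :gzipped_tar,
  weighted_glob.new('*.gz', 50)     => :gzip,
}
heaviest.call(globs)  # picks :gzipped_tar, the value under the heavier key
```

Returning `nil` for empty input lets a caller fall through to another lookup with `||`.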

StickAround

Provide case-optional String-like keys for Postfixes, Globs, etc.

From Ruby's `Hash` docs: “Two objects refer to the same hash key when their hash value is identical and the two objects are eql? to each other.” I tried to subclass String and just override `:eql?` and `:hash` for case-insensitive lookups, but it turns out not to be that easy due to MRI's C comparison functions for String, Symbol, etc.

It was super-confusing because I could call e.g. `'DOC'.eql? 'doc'` manually and get `true`, but it would always fail to work when used as a `Hash` key, when calling `uniq`, or in a `Set`:

irb(main):049:1* Lol = Class.new(String).tap {
irb(main):050:1* _1.define_method(:hash) do; self.downcase!.hash; end;
irb(main):051:1* _1.define_method(:eql?) do |lol|; self.casecmp?(lol); end;
irb(main):052:1* _1.alias_method(:==, :eql?)
irb(main):053:0> }
irb(main):054:0> fart = Lol.new("abcdefg")
irb(main):055:0> butt = Lol.new("abcdefgh")
irb(main):056:0> fart == butt

> true

irb(main):057:0> fart.eql? butt

> true

irb(main):058:0> fart.hash

> 1243221847611081438

irb(main):059:0> butt.hash

> 1243221847611081438

irb(main):060:0> fart => "smella"

> nil

irb(main):061:0> fart => "smella"

> "smella"

I'm not the first to run into this, as I found when searching for `“rb_str_hash_cmp”`: kate.io/blog/strange-hash-instances-in-ruby/

To work around this I will explicitly `downcase` the actual String subclass' value and just let the hashes collide for differently-cased values, then `eql?` will decide. This is still slower than the all-C String code but is the fastest method I've found to achieve this without doubling my Object allocations by wrapping each String in a Struct.
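A minimal sketch of that workaround, under the assumption that normalizing the stored value at construction time is acceptable. The `case_optional` name is hypothetical, and this version throws away the original casing, which the real class may not do:

```ruby
# Hypothetical String subclass: the stored bytes are always downcased, so the
# built-in String hashing lands differently-cased inputs in the same bucket,
# and eql? makes the final decision.
case_optional = Class.new(String) do
  def initialize(str)
    super(str.downcase)
  end

  # Case-insensitive equality against any other String.
  def eql?(other)
    other.is_a?(::String) && casecmp?(other) == true
  end
  alias_method :==, :eql?
end

types = { case_optional.new('.DOC') => :msword }
types[case_optional.new('.doc')]  # finds :msword
types[case_optional.new('.Doc')]  # finds :msword
```

No wrapper object is allocated per key, which is the point of subclassing String instead of wrapping it in a Struct.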

TEST_EXTANT_PATHNAME

Test a Pathname representing an extant file whose contents and metadata we can use. This is separated into a lambda due to the complexity, since the entry-point might be given a String that could represent a Media Type, a hypothetical path, an extant path, or even raw stream contents. It could be given a Pathname representing either a hypothetical or extant file. It could be given an IO/Stream object. Several input possibilities will end up calling this lambda.

Some of this complexity is my fault: I'm doing a lot of variable juggling to avoid as many new-Object allocations as possible in the name of performance, since this library is the very core-est core of DistorteD. One example is assigning Hash values to single CYO objects the first time a key is stored, then replacing that value with a Set iff that key needs to reference any additional CYO.

  • `::from_xattr` can return `nil` or a single `CYO` depending on filesystem extended attributes. It is very very unlikely that most people will ever use this, but I think it's cool 8)

  • `::from_postfix` can return `nil`, `CYO`, or `Set`. I decided to store Postfixes separately from freeform globs because file-extension matches are the vast majority of globs. Postfixes avoid needing to be weighted since they all represent the same final pathname component and should never result in multiple conflicting Postfix key matches. A single Postfix key can represent multiple CYOs, though; hence the possible `Set`.

  • `::from_glob` can return `nil` or `Hash` since even a single match will include the weighted key.

  • `::from_content` can return `nil` or `Hash` based on a `libmagic`-style match of file/stream contents. Many common types can be determined from the first four bytes alone, but we support matching arbitrarily-long sequences against arbitrarily-big byte range boundaries. These keys will also be weighted, even for a single match.
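To make the content-matching bullet concrete, here is a rough sketch of matching a byte sequence against a range of start offsets and returning a weighted `Hash` (or `nil`). The `magic` Struct, the `from_content` lambda signature, the table layout, and the weights are all assumptions; only the PNG signature and the `ustar` marker at offset 257 are real-world facts:

```ruby
# Hypothetical weighted key: the bytes to find, which start offsets to try,
# and a weight for ranking competing matches.
magic = Struct.new(:bytes, :offsets, :weight)

# Return every table entry whose byte sequence appears at one of its allowed
# offsets, keeping the weighted keys; nil when nothing matches.
from_content = lambda { |data, table|
  bytes = data.b  # compare as raw bytes regardless of the input's encoding
  hits = table.select { |m, _type|
    m.offsets.any? { |start| bytes.byteslice(start, m.bytes.bytesize) == m.bytes }
  }
  hits.empty? ? nil : hits
}

table = {
  magic.new("\x89PNG".b, 0..0, 50)  => 'image/png',
  magic.new('ustar'.b, 257..257, 50) => 'application/x-tar',
}

png_header = "\x89PNG\r\n\x1A\n".b
tar_like   = ("\0" * 257 + 'ustar').b
from_content.call(png_header, table)  # one weighted match for image/png
```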

Public Class Methods

OUT(unknown_identifier, so_deep: true)
# File lib/checking-you-out.rb, line 19
def self.OUT(unknown_identifier, so_deep: true)
  case unknown_identifier
  when ::Pathname
    TEST_EXTANT_PATHNAME.call(unknown_identifier)
  when ::String
    case
    when unknown_identifier.count(-?/) == 1 then  # TODO: Additional String validation here.
      ::CHECKING::YOU::OUT::from_ietf_media_type(unknown_identifier)
    when unknown_identifier.start_with?(-?.) && unknown_identifier.count(-?.) == 1 then
      ::CHECKING::YOU::OUT::from_pathname(unknown_identifier)
    else
      if File::exist?(File::expand_path(unknown_identifier)) and so_deep then
        TEST_EXTANT_PATHNAME.call(Pathname.new(File::expand_path(unknown_identifier)))
      else
        LEGENDARY_HEAVY_GLOW.call(::CHECKING::YOU::OUT::from_glob(unknown_identifier), :weight) || ::CHECKING::YOU::OUT::from_postfix(unknown_identifier)
      end
    end
  when ::CHECKING::YOU::IN
    unknown_identifier.out
  end
end
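
The String branch above decides between three interpretations using only character counts. A standalone sketch of that classification (the `classify` name and the returned Symbols are hypothetical, not part of the library):

```ruby
# Mirror the String checks in OUT: exactly one slash reads as a Media Type,
# a single leading dot reads as an extname, and anything else is treated as
# a path or glob candidate.
classify = lambda { |str|
  if str.count('/') == 1
    :media_type
  elsif str.start_with?('.') && str.count('.') == 1
    :extname
  else
    :path_or_glob
  end
}

classify.call('application/xhtml+xml')  # :media_type
classify.call('.doc')                   # :extname
classify.call('archive.tar.gz')         # :path_or_glob
classify.call('docs/readme.txt')        # :media_type
```

A relative path with exactly one slash lands in the Media Type branch, which is presumably the kind of case the `TODO` about additional String validation has in mind.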