Friday, February 4, 2011

Why do ruby setters need "self." qualification within the class?

Ruby setters -- whether created by (c)attr_accessor or manually -- seem to be the only methods that need "self." qualification when accessed within the class itself. This seems to put Ruby alone in the world of languages:

  • all methods need self/this (like perl, and I think Javascript)
  • no methods require self/this (C#, Java)
  • only setters need self/this (ruby??)

The best comparison is C# vs ruby, because both languages support accessor methods which work syntactically just like class instance variables: foo.x = y, y = foo.x . C# calls them properties.

Here's a simple example: the same program in ruby, then C#.


class A
  def qwerty; @q; end                   # manual getter
  def qwerty=(value); @q = value; end   # manual setter, but attr_accessor is same
  def asdf; self.qwerty = 4; end        # "self." is necessary in ruby?
  def xxx; asdf; end                    # we can invoke nonsetters w/o "self."
  def dump; puts "qwerty = #{qwerty}"; end
end

a = A.new
a.xxx
a.dump
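For contrast, here is a minimal sketch of the same class with the "self." dropped from asdf. The class name B and the use of inspect in dump are assumptions made for illustration; they are not part of the original program.

```ruby
# Hypothetical class B: same as A, except asdf omits "self.".
class B
  def qwerty; @q; end
  def qwerty=(value); @q = value; end
  def asdf; qwerty = 4; end   # parsed as a local variable assignment
  def xxx; asdf; end
  def dump; "qwerty = #{qwerty.inspect}"; end
end

b = B.new
b.xxx
puts b.dump   # prints "qwerty = nil": the setter never ran, so @q is unset
```

No error is raised. The assignment quietly creates a local named qwerty inside asdf, which is discarded when the method returns.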

Take away the "self." from self.qwerty = 4 and it fails (ruby 1.8.6 on linux & osx). Now C#:

using System;

public class A {
  public A() {}
  int q;
  public int qwerty {
    get { return q; }
    set { q = value; }
  }
  public void asdf() { qwerty = 4; } // C# setters work w/o "this."
  public void xxx()  { asdf(); }     // are just like other methods
  public void dump() { Console.WriteLine("qwerty = {0}", qwerty); }
}

public class Test {
  public static void Main() {
    A a = new A();
    a.xxx();
    a.dump();
  }
}

Question: Is this true? Are there other occasions besides setters where self is necessary?


Thanks all for the feedback. First let me be more precise about the concluding question.

Question at Bottom: Are there other occasions where a ruby method cannot be invoked without self?

I agree, there are lots of cases where self becomes necessary. This is not unique to ruby, just to be clear:

using System;

public class A {
  public A() {}
  public int test { get { return 4; }}
  public int useVariable() {
    int test = 5;
    return test;
  }
  public int useMethod() {
    int test = 5;
    return this.test;
  }
}

public class Test {
  public static void Main() {
    A a = new A();
    Console.WriteLine("{0}", a.useVariable()); // prints 5
    Console.WriteLine("{0}", a.useMethod());   // prints 4
  }
}

The same ambiguity is resolved in the same way. But, subtle as it is, I'm asking about the case where

  • A method has been defined, and
  • No local variable has been defined, and

we encounter

qwerty = 4

Ambiguity: Is this a method invocation or a new local variable assignment?
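One way to see how ruby settles this is to note that the decision is made when the code is parsed, not when it runs. The sketch below is illustrative only (class C and its method names are hypothetical): once the parser has seen "qwerty =" in a method body, later bare uses of qwerty in that body refer to the local, even if the assignment never executes.

```ruby
# Hypothetical class C, used only to illustrate parse-time resolution.
class C
  def qwerty; :from_getter; end

  def before_assignment
    qwerty                # no local seen yet: this is a method call
  end

  def after_dead_assignment
    qwerty = 4 if false   # never executes, but the parser has seen "qwerty ="
    qwerty                # so this reads the (nil) local, not the getter
  end
end

c = C.new
p c.before_assignment      # => :from_getter
p c.after_dead_assignment  # => nil
```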

The title question, as to why ruby always treats this as an assignment, is perhaps best answered by ben. Let me paraphrase:

Summary: The parser could treat "symbol =" as an lvalue and dynamically decide between assignment and invocation. The dynamic nature of ruby means every assignment potentially faces this ambiguity, so in the interest of performance, ruby treats this as assignment always. C# benefits from knowing what all the methods are, and treats this case the opposite way (as a method invocation).
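To make the "dynamic nature" part concrete, here is a small sketch (class D is hypothetical) showing that a setter can appear and disappear while the program runs. Whether a setter exists for a given name is therefore a question that can only be answered at the moment the assignment executes.

```ruby
# Hypothetical class D: starts with a getter only.
class D
  def qwerty; @q; end
end

d = D.new
seen = []
seen << d.respond_to?(:qwerty=)   # false: no setter yet

# Reopen the class at runtime and add the setter.
class D
  def qwerty=(value); @q = value; end
end
seen << d.respond_to?(:qwerty=)   # true: the existing object gains it

# Remove it again, also at runtime.
D.send(:remove_method, :qwerty=)
seen << d.respond_to?(:qwerty=)   # false again

p seen   # => [false, true, false]
```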

  • Well, I think the reason this is the case is because "qwerty = 4" is ambiguous... are you defining a new variable called "qwerty" or calling the setter? Ruby resolves this ambiguity by saying it will create a new variable, thus the "self." is required.

    Here is another case where you need "self.":

    class A
      def test
        4
      end
      def use_variable
        test = 5
        test
      end
      def use_method
        test = 5
        self.test
      end
    end
    a = A.new
    a.use_variable # returns 5
    a.use_method   # returns 4

    As you can see, the access to "test" is ambiguous, so the "self." is required.

    EDIT: Also, this is why the C# example is actually not a good comparison, because you define variables in a way that is unambiguous from using the setter... if you had defined a variable in C# that was the same name as the accessor, you would need to qualify calls to the accessor with "this." just like the ruby case.

    From Mike Stone
  • The important thing to remember here is that Ruby methods can be (un)defined at any point, so to intelligently resolve the ambiguity, every assignment would need to run code to check whether there is a method with the assigned-to name at the time of assignment.

    From ben
  • @Purfideas

    My point before about C# not being a good comparison was that it is a completely different case because "qwerty = 4" is UNAMBIGUOUS in C# when there is no variable defined... it has nothing to do with knowing what all the methods are. That simply is not how you define a variable, whereas in ruby that expression alone IS ambiguous (as both variable definition and method invocation).

    I seriously doubt it has anything to do with performance... because consider this for a moment: let's say you WANT to define a variable that has the same name as a setter... how would you do this syntactically if "variable=" always invoked a method, if there is one? The answer is you couldn't unless you introduced a new language construct. However, with how the language ACTUALLY works, there already is a way to both define a variable and invoke the setter. With this consideration in mind, it seems a no-brainer to me to have the ambiguity be resolved by creating the variable... and this isn't even considering the fact that Ruby just was not designed for performance.

    "Why do ruby setters need “self.” qualification within the class?" Because of the ambiguity in the language of variable creation vs method invocation.... C# doesn't have this issue because this ambiguity just does NOT exist... the self/this is required in both languages when the ambiguity exists in both languages. (this is what I was trying to point out before)

    From Mike Stone
  • In your particular test case, there does not appear to be any reason not to use the @ syntax from within the class. Your setter isn't performing any data validation, so there's nothing lost by referring to it with the instance variable syntax.

    You do have to use self for those cases where you'd like to use a data-checking setter, but I have run into very few circumstances where I needed to use a data-checking version of a method from within my object. My checks are typically put in place to prevent invalid values supplied when external objects are using my object's interface.

  • @Mike Stone

    Hi! I understand and appreciate the points you've made and your example was great. Believe me when I say, if I had enough reputation, I'd vote up your response. Yet we still disagree:

    • on a matter of semantics, and
    • on a central point of fact

    First I claim, not without irony, we're having a semantic debate about the meaning of 'ambiguity'.

    When it comes to parsing and programming language semantics (the subject of this question), surely you would admit a broad spectrum of the notion 'ambiguity'. Let's just adopt some random notation:

    1. ambiguous: lexical ambiguity (lex must 'look ahead')
    2. Ambiguous: grammatical ambiguity (yacc must defer to parse-tree analysis)
    3. AMBIGUOUS: ambiguity knowing everything at the moment of execution

    (and there's junk between 2-3 too). All these categories are resolved by gathering more contextual info, looking more and more globally. So when you say,

    "qwerty = 4" is UNAMBIGUOUS in C# when there is no variable defined...

    I couldn't agree more. But by the same token, I'm saying

    "qwerty = 4" is un-Ambiguous in ruby (as it now exists)

    "qwerty = 4" is Ambiguous in C#

    And we're not yet contradicting each other. Finally, here's where we really disagree: Either ruby could or could not be implemented without any further language constructs such that,

    For "qwerty = 4," ruby UNAMBIGUOUSLY invokes an existing setter if there
    is no local variable defined

    You say no. I say yes; another ruby could exist which behaves exactly like the current in every respect, except "qwerty = 4" defines a new variable when no setter and no local exists, it invokes the setter if one exists, and it assigns to the local if one exists. I fully accept that I could be wrong. In fact, a reason why I might be wrong would be interesting.

    Let me explain.

    Imagine you are writing a new OO language with accessor methods looking like instance vars (like ruby & C#). You'd probably start with conceptual grammars something like:

      var = expr    // assignment
      method = expr // setter method invocation

    But the parser-compiler (not even the runtime) will puke, because even after all the input is grokked there's no way to know which grammar is pertinent. You're faced with a classic choice. I can't be sure of the details, but basically ruby does this:

      var = expr    // assignment (new or existing)
      // method = expr, disallow setter method invocation without .

    That is why it's un-Ambiguous, while C# does this:

      symbol = expr // push 'symbol=' onto parse tree and decide later
                    // if local variable is def'd somewhere in scope: assignment
                    // else if a setter is def'd in scope: invocation

    For C#, 'later' is still at compile time.

    I'm sure ruby could do the same, but 'later' would have to be at runtime, because as ben points out you don't know until the statement is executed which case applies.

    My question was never intended to mean "do I really need the 'self.'?" or "what potential ambiguity is being avoided?" Rather, I wanted to know why this particular choice was made. Maybe it's not performance. Maybe it just got the job done, or it was considered best to always allow a 1-liner local to override a method (a pretty rare case requirement) ...

    But I'm sort of suggesting that the most dynamic language might be the one which postpones this decision the longest, and chooses semantics based on the most contextual info: so if you have no local and you defined a setter, it would use the setter. Isn't this why we like ruby, smalltalk, objc, because method invocation is decided at runtime, offering maximum expressiveness?

    From Purfideas
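The rule proposed in the last comment (assign to an existing local, else invoke an existing setter, else define a new local) can be emulated in today's ruby as a rough sketch. Everything here is hypothetical: Scope, assign, and local? are invented names, and a Hash stands in for the local variable table, since ruby itself offers no hook into bare assignment.

```ruby
# Hypothetical helper, not part of ruby: emulates the proposed rule.
# A Hash stands in for the local variable table.
class Scope
  def initialize(receiver)
    @receiver = receiver
    @locals = {}
  end

  def assign(name, value)
    setter = "#{name}="
    if @locals.key?(name)
      @locals[name] = value              # an existing local wins
    elsif @receiver.respond_to?(setter)
      @receiver.send(setter, value)      # otherwise an existing setter is invoked
    else
      @locals[name] = value              # otherwise a new local is defined
    end
  end

  def local?(name); @locals.key?(name); end
end

# Hypothetical class E with an ordinary accessor.
class E
  attr_accessor :qwerty
end

e = E.new
scope = Scope.new(e)
scope.assign(:qwerty, 4)   # a setter exists, so it is invoked
scope.assign(:other, 5)    # no setter, so a "local" is created
p e.qwerty                 # => 4
p scope.local?(:other)     # => true
p scope.local?(:qwerty)    # => false
```

Note that every call to assign pays for a respond_to? lookup, which is the runtime check ben describes, and that a setter defined later would silently change what the same line of code does.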

